Strong validation is needed here, otherwise malformed input reaches Celery and the task crashes silently. The validation also checks whether the selected source supports search and/or category filtering.
Ideally, category and search would both be fully optional; in practice, some sites break if one or the other is not provided.
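A minimal sketch of what that validation step could look like. The `SOURCE_CAPABILITIES` table and the `validate_request` function are illustrative assumptions, not the project's actual API:

```python
from typing import Optional

# Hypothetical capability table; which sources support what is assumed here.
SOURCE_CAPABILITIES = {
    "reddit": {"search": True, "category": True},
    "boards.ie": {"search": False, "category": True},
}

def validate_request(source: str, search: Optional[str], category: Optional[str]) -> None:
    """Reject bad input before the job is handed to a Celery worker."""
    caps = SOURCE_CAPABILITIES.get(source)
    if caps is None:
        raise ValueError(f"unknown source: {source}")
    if search and not caps["search"]:
        raise ValueError(f"{source} does not support search")
    if category and not caps["category"]:
        raise ValueError(f"{source} does not support categories")
    if not search and not category:
        raise ValueError("provide at least a search query or a category")
```

Raising a `ValueError` at the Flask layer means the user gets an immediate 4xx-style error instead of a silently failed background task.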
Unfortunately, `boards.ie` serves a different page type for searches, and I'm not going to implement a scraper for it from scratch.
In addition, removed comment limit options.
This is easier and quicker than deriving a topics list from the dataset that has been scraped.
While using an LLM to generate a personalised topic list from the query, category, or dataset itself would yield better results in most cases, it is beyond the scope of this project.
The idea is a "plugin-type" system, where new connectors extend the `BaseConnector` class and implement the fetch-posts method.
These are automatically detected by the registry and automatically exposed through new Flask endpoints that list the available sources.
This allows for an open-ended system where new data scrapers / API consumers can be added dynamically.
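The registry idea above can be sketched roughly as follows. The names (`BaseConnector`, `fetch_posts`, `REGISTRY`) follow the notes, but the auto-registration mechanism shown here is one possible implementation, not necessarily the project's:

```python
from abc import ABC, abstractmethod

# Maps a source name to its connector class; a Flask endpoint can simply
# return list(REGISTRY.keys()) to advertise the available sources.
REGISTRY: dict[str, type["BaseConnector"]] = {}

class BaseConnector(ABC):
    source_name: str = ""

    def __init_subclass__(cls, **kwargs):
        # Auto-register every concrete subclass so no manual wiring is needed.
        super().__init_subclass__(**kwargs)
        if cls.source_name:
            REGISTRY[cls.source_name] = cls

    @abstractmethod
    def fetch_posts(self, query: str) -> list[dict]:
        ...

class RedditConnector(BaseConnector):
    source_name = "reddit"

    def fetch_posts(self, query: str) -> list[dict]:
        return []  # a real connector would call the Reddit API here
```

Using `__init_subclass__` keeps registration implicit: simply defining a new connector class is enough for it to show up in the sources endpoint.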
Instantiating the database and dataset manager objects inside the try block causes problems if something else fails.
If an exception occurs before `dataset_manager` is initialised, any code in the except block that references it will itself fail with a `NameError`.
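The fix is to bind the names before entering the try block. A small sketch of the pattern, with `Database` and `DatasetManager` as placeholder stand-ins for the real classes:

```python
class Database:
    """Placeholder for the real database wrapper."""
    def __init__(self, fail: bool = False):
        if fail:
            raise RuntimeError("connection failed")

class DatasetManager:
    """Placeholder for the real dataset manager."""
    def __init__(self, db):
        self.db = db

def setup(fail_db: bool = False):
    # Bind to None BEFORE the try block so the except handler can safely
    # inspect dataset_manager even when Database() raised first.
    dataset_manager = None
    try:
        db = Database(fail=fail_db)
        dataset_manager = DatasetManager(db)
    except RuntimeError:
        # Without the prior None binding, referencing dataset_manager here
        # would raise NameError instead of handling the original error.
        pass
    return dataset_manager
```

This keeps the except block safe regardless of how far setup got before the exception.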
The interaction graph was taking up too much space and was the only thing on the screen. Further statistics were added, though these may be removed in favour of more informative ones.
Originally there was a simple "Loading" text, but this looked bad and could lead a user to think the page had frozen.
There is now a more comprehensive loading animation that users should be happier to sit with for a few minutes.
This metric was never statistically significant and held no real value. It also happened to contain stray NaN values in the dataframe, which broke the frontend.
Happy to remove.