37d08c63b8
chore: rename auto-scraper to auto-fetcher
...
Improves the perception of ethics
2026-04-01 09:50:53 +01:00
cd6030a760
fix(ngrams): remove stop words from ngrams
2026-04-01 08:44:47 +01:00
b270ed03ae
feat(frontend): implement corpus explorer
...
This allows you to view the posts & comments associated with a specific aggregate.
2026-04-01 00:04:25 +01:00
75fd042d74
feat(api): add support for custom topic lists when autoscraping
2026-03-31 13:36:37 +01:00
376773a0cc
style: run python linter & prettifier on backend code
2026-03-25 19:34:43 +00:00
8372aa7278
feat(api): add endpoint to view entire dataset
2026-03-17 13:36:41 +00:00
3468fdc2ea
feat(api): add new user and linguistic endpoints
2026-03-16 16:45:11 +00:00
12f5953146
fix(api): remove error exceptions in API responses
...
Mainly a security measure: actual code errors should not be returned in the API response, as they could reveal how the code works internally.
2026-03-14 21:58:00 +00:00
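A minimal sketch of the idea behind this fix, using only the standard library rather than the project's Flask code. The names `safe_error_response` and `handle` are hypothetical and do not come from the repository; the sketch only assumes the goal is to log the real exception server-side and return a generic message to the client.

```python
import logging

logger = logging.getLogger("api")


def safe_error_response(exc: Exception) -> tuple[dict, int]:
    """Log the real exception server-side and return a generic body."""
    # The traceback goes to the server log only, never to the client.
    logger.error("Unhandled error while processing request", exc_info=exc)
    return {"error": "An internal error occurred"}, 500


def handle(request_fn):
    """Run a request handler (hypothetical wrapper) and hide failures."""
    try:
        return request_fn(), 200
    except Exception as exc:
        return safe_error_response(exc)
```

In a Flask app the same effect is usually reached with a registered error handler; the wrapper above is only meant to show that the exception text never reaches the response body.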
d2b919cd66
fix(api): enforce integer limit and cap at 1000 in scrape_data function
2026-03-14 17:35:05 +00:00
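A hedged sketch of what enforcing an integer limit with a cap of 1000 can look like. `parse_limit` is a hypothetical helper, not the actual body of `scrape_data`, and rejecting values below 1 is an assumption that the commit does not state.

```python
MAX_LIMIT = 1000  # cap taken from the commit subject


def parse_limit(raw, max_limit=MAX_LIMIT):
    """Return raw as a positive int capped at max_limit, or raise ValueError."""
    # bool is a subclass of int, and floats would be truncated silently by int().
    if isinstance(raw, (bool, float)):
        raise ValueError("limit must be an integer")
    try:
        value = int(raw)
    except (TypeError, ValueError):
        raise ValueError("limit must be an integer") from None
    if value < 1:  # assumption: non-positive limits are rejected
        raise ValueError("limit must be at least 1")
    return min(value, max_limit)
```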
062937ec3c
fix(api): incorrect validation on search
2026-03-14 17:12:02 +00:00
8a423b2a29
feat(connectors): implement category validation in scraping process
2026-03-14 16:59:43 +00:00
6684780d23
fix(connectors): add stronger validation to scrape endpoint
...
Strong validation is needed; otherwise invalid data reaches Celery and the task crashes silently. The endpoint also checks whether the requested source supports search or category.
2026-03-12 09:59:07 +00:00
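The check described above can be sketched as a validation step that runs before anything is queued. Everything here is illustrative: the `SourceCapabilities` class, the `SOURCES` table, the source names and the payload keys are assumptions, not the project's real connector metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceCapabilities:
    supports_search: bool
    supports_category: bool


# Hypothetical capability table; the real project derives this from its connectors.
SOURCES = {
    "forum": SourceCapabilities(supports_search=True, supports_category=True),
    "feed": SourceCapabilities(supports_search=False, supports_category=True),
}


def validate_scrape_request(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means safe to queue."""
    name = payload.get("source")
    caps = SOURCES.get(name)
    if caps is None:
        return [f"unknown source: {name!r}"]
    errors = []
    if payload.get("search") and not caps.supports_search:
        errors.append(f"source {name!r} does not support search")
    if payload.get("category") and not caps.supports_category:
        errors.append(f"source {name!r} does not support category")
    return errors
```

Returning the errors to the caller up front means a bad request is rejected with a visible message instead of failing later inside a background worker.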
12cbc24074
chore(utils): remove split_limit function
2026-03-11 19:47:44 +00:00
524c9c50a0
fix(api): incorrect dataset status update message
2026-03-10 23:28:21 +00:00
2ab74d922a
feat(api): support per-source search, category and limit configuration
2026-03-10 23:15:33 +00:00
8fe84a30f6
fix: data leak when opening topics file
2026-03-10 22:45:07 +00:00
a65c4a461c
fix(api): flask delegates dataset fetch to celery
2026-03-10 19:17:41 +00:00
6ec47256d0
feat(api): add database scraping endpoints
2026-03-10 19:04:33 +00:00
5ccb2e73cd
fix(connectors): incorrect registry location
...
Registry paths were using the incorrect connector path locations.
2026-03-10 18:18:42 +00:00
cc799f7368
feat(connectors): add base connector and registry for detection
...
The idea is a plugin-style system in which new connectors extend the `BaseConnector` class and implement the fetch posts method.
The registry detects these connectors automatically, and the new Flask endpoints that list the possible sources pick them up without further changes.
This keeps the system open-ended: new data scrapers / API consumers can be added dynamically.
2026-03-09 21:29:03 +00:00
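A rough sketch of the plugin pattern described above. Only `BaseConnector` is named in the commit; `fetch_posts`, `source_name`, `REGISTRY`, `ExampleConnector` and `available_sources` are hypothetical. Registration here uses `__init_subclass__`, which is one possible detection mechanism and not necessarily the one the registry actually uses.

```python
from abc import ABC, abstractmethod

# Maps a source name to its connector class (hypothetical structure).
REGISTRY = {}


class BaseConnector(ABC):
    source_name = None  # hypothetical attribute used as the registry key

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Every subclass that declares a name registers itself when defined.
        if cls.source_name:
            REGISTRY[cls.source_name] = cls

    @abstractmethod
    def fetch_posts(self, limit):
        """Return up to `limit` posts from the source."""


class ExampleConnector(BaseConnector):
    source_name = "example"

    def fetch_posts(self, limit):
        # Placeholder data; a real connector would call a site or an API.
        return [{"id": i, "text": f"post {i}"} for i in range(limit)]


def available_sources():
    """What an endpoint listing the possible sources could return."""
    return sorted(REGISTRY)
```

With this approach a connector only registers once its module has been imported, so a real registry still needs some way to import every connector module.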
262a70dbf3
refactor(api): rename /upload endpoint
...
Ensures consistency with the other dataset-based endpoints and follows REST API conventions more closely.
2026-03-09 20:55:12 +00:00
f5835b5a97
feat(frontend): add frontend option to change name
2026-03-04 22:17:31 +00:00
64e3f9eea8
feat: implement PATCH dataset route
...
At the moment this only allows the name to be updated, which seems to be the only editable part of the dataset metadata.
2026-03-04 21:38:06 +00:00
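One way to restrict a PATCH to the name is an allow-list of editable fields. This is a sketch under assumptions: `EDITABLE_FIELDS`, `apply_patch` and the dictionary shape of a dataset are hypothetical and not taken from the route's implementation.

```python
EDITABLE_FIELDS = {"name"}  # assumption: name is the only editable field


def apply_patch(dataset: dict, patch: dict) -> dict:
    """Return a copy of dataset with the patch applied, rejecting other fields."""
    unknown = set(patch) - EDITABLE_FIELDS
    if unknown:
        raise ValueError(f"fields not editable: {sorted(unknown)}")
    if "name" in patch:
        name = patch["name"]
        if not isinstance(name, str) or not name.strip():
            raise ValueError("name must be a non-empty string")
    # The original dict is left untouched; the caller persists the result.
    return {**dataset, **patch}
```

If more metadata becomes editable later, only the allow-list needs to grow.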
f1f33e2fe4
feat: implement delete dataset route
2026-03-04 21:29:01 +00:00
207c4b67da
feat(frontend): add dataset name requirements to the upload page
2026-03-03 17:28:46 +00:00
772205d3df
feat(api): add ability to fetch all datasets by a user
2026-03-03 17:25:00 +00:00
5310568631
feat: add React layout and a topbar allowing for easy logins
2026-03-03 17:17:57 +00:00
eb4187c559
feat(api): add status returns for NLP processing
2026-03-03 13:46:37 +00:00
075e1fba85
fix: typo in exception naming
2026-03-03 13:12:28 +00:00
3a58705635
feat: add celery & redis for background data processing
2026-03-03 12:27:14 +00:00
6248b32ce2
refactor: move app.py into main server dir
2026-03-03 11:14:51 +00:00
87bdc0245a
refactor: move core files into separate dirs
2026-03-03 11:13:33 +00:00
36bede42d9
style: clean up imports
2026-03-03 11:08:56 +00:00
4bec0dd32c
refactor: extract dataset functionality out of db class
2026-03-02 19:18:05 +00:00
4961ddc349
refactor: move db dir into server
2026-03-02 19:05:56 +00:00
c9151da643
feat: add custom error for non-existent dataset
2026-03-02 18:59:31 +00:00
18c8539646
fix: server error when attempting to access non-existent dataset
2026-03-02 18:55:27 +00:00
5ea71023b5
refactor: move query parameter extraction function out of flask app
2026-03-02 18:29:09 +00:00
37cb2c9ff4
feat(querying): make filters stateless
...
Stateless filters are required because the server cannot store them in the StatGen object.
2026-03-02 16:18:02 +00:00
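A minimal illustration of what stateless filtering can look like: the filters arrive with each request and are applied by a pure function, so nothing has to be kept on a server-side object between calls. The filter keys (`source`, `search`) and the row layout are assumptions, and plain dictionaries stand in for whatever data structure the project uses.

```python
def apply_filters(rows, filters):
    """Return the rows matching every filter; no state is kept between calls."""

    def matches(row):
        if "source" in filters and row["source"] != filters["source"]:
            return False
        if "search" in filters and filters["search"].lower() not in row["text"].lower():
            return False
        return True

    return [row for row in rows if matches(row)]


# Each request passes its own filters; two requests cannot affect each other.
rows = [
    {"source": "forum", "text": "Celery worker crashed"},
    {"source": "feed", "text": "New release notes"},
    {"source": "forum", "text": "Release planning"},
]
forum_release = apply_filters(rows, {"source": "forum", "search": "release"})
```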
82a98f84bd
refactor: combine query results into one endpoint
2026-03-01 19:06:49 +00:00
7ddd625bf8
fix: database schema missing type column
2026-03-01 16:40:00 +00:00
07ab7529a9
refactor: update analysis classes to accept DataFrame as parameter instead of instance variable
2026-03-01 16:25:39 +00:00
d20790ed4b
fix: incorrect dataset authorisation check
2026-03-01 16:10:42 +00:00
d3c4d883be
Merge branch 'auth-test' of gitea:dylan/crosspost into auth-test
2026-03-01 16:01:48 +00:00
5fb7710dc2
feat: dataset now persists to database
2026-03-01 16:01:15 +00:00
d73f4f1c45
Merge branch 'main' into auth-test
2026-02-25 08:59:32 +00:00
6695d3d272
refactor: improve API wording & cleanup code
2026-02-24 15:55:56 +00:00
be6ab1f929
feat: add profile endpoint to view user details
2026-02-23 22:43:55 +00:00
3165bf1aa9
feat: add login endpoint
2026-02-23 22:40:26 +00:00
0589b2c8a5
feat: add /register endpoint
2026-02-23 22:27:32 +00:00