Commit Graph

371 Commits

SHA1 Message Date
98aa04256b fix(reddit_api): fix reddit ratelimit check 2026-04-04 10:20:48 +01:00
5f81c51979 docs(report): add scalability constraints 2026-04-03 20:06:19 +01:00
361b532766 docs(analysis): add feasibility analysis 2026-04-03 20:02:22 +01:00
9ef96661fc report(analysis): update structure & add justifications 2026-04-03 18:35:08 +01:00
9375abded5 docs(design): add docker & async processing sections 2026-04-03 17:59:01 +01:00
74ecdf238a docs: add database schema diagram 2026-04-02 19:30:20 +01:00
b85987e179 docs: add system architecture diagram 2026-04-02 18:59:32 +01:00
1dde5f7b08 fix(nlp): fix missing processing dataset status update 2026-03-31 20:59:09 +01:00
a841c6f6a1 perf(stats): memoize derived state and reduce intermediate allocations 2026-03-31 20:15:07 +01:00
2045ccebb5 build(docker): update CMD to include host binding 2026-03-31 19:31:58 +01:00
efb4c8384d chore(stats): remove average_thread_depth 2026-03-31 16:40:54 +01:00
75fd042d74 feat(api): add support for custom topic lists when autoscraping 2026-03-31 13:36:37 +01:00
e776ef53ac refactor(database): configurable database source 2026-03-29 21:30:18 +01:00
f996b38fa5 fix(report): remove unicode char 2026-03-25 19:46:29 +00:00
6d8ae3e811 docs: add section on Topic Modelling in NLP 2026-03-25 19:44:14 +00:00
376773a0cc style: run python linter & prettifier on backend code 2026-03-25 19:34:43 +00:00
aae10c4d9d style: run prettifier plugin on entire frontend 2026-03-25 19:30:21 +00:00
8730af146d chore: remove main.py
Not used anymore.
2026-03-22 14:41:47 +00:00
7716ee0bff build(env): extract Redis URL into env file
This makes it possible to connect to a Redis instance on a remote machine with a powerful GPU, so the NLP work can be offloaded to it.
2026-03-22 14:41:15 +00:00
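As a rough illustration of the change described in 7716ee0bff, the sketch below reads the Redis URL from the environment and falls back to a local instance. The variable name `REDIS_URL`, the default URL, and the helper `get_redis_url` are assumptions made for this example; the actual names used in the repository are not shown in this log.

```python
import os

# Hypothetical default: a Redis instance on the same machine.
DEFAULT_REDIS_URL = "redis://localhost:6379/0"


def get_redis_url(env=None):
    """Return the Redis URL from the environment, or the local default.

    `env` is any mapping of variable names to values; it defaults to
    os.environ so the function can be exercised without touching the
    real environment.
    """
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the local default.
    return env.get("REDIS_URL") or DEFAULT_REDIS_URL


if __name__ == "__main__":
    print(get_redis_url())
```

Pointing `REDIS_URL` at a remote host (for example in an env file) would then redirect the worker queue without code changes, which matches the offloading use case the commit message mentions.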
97e897c240 fix(analysis): broken entity handling in cultural endpoint 2026-03-22 14:34:05 +00:00
c3762f189c build(docker): comment out GPU deployment configuration from worker service
While this works for NVIDIA GPUs, it breaks on a MacBook or any non-NVIDIA machine. I commented it out rather than deleting it because it's still useful on NVIDIA machines.
2026-03-22 13:34:51 +00:00
078716754c feat(report): add main.tex for project documentation and analysis 2026-03-21 23:54:42 +00:00
e43eae5afd fix(frontend): missing "fetching" status from auto-scrape
When auto-scraping, the dataset status page would say "Dataset Ready" when it was still fetching.
2026-03-21 22:49:16 +00:00
b537b5ef16 docs: update .gitignore 2026-03-21 19:24:51 +00:00
acc591ff1e Merge pull request 'Finish off the links between frontend and backend' (#10) from feat/add-frontend-pages into main
Reviewed-on: #10
2026-03-18 20:30:19 +00:00
e054997bb1 feat(frontend): reword CulturalStats to improve understandability 2026-03-18 19:23:35 +00:00
e5414befa7 feat(frontend): add dominant emotion display to UserModal 2026-03-18 19:12:25 +00:00
86926898ce feat(frontend): improve labels to be more understandable 2026-03-18 19:12:11 +00:00
b1177540a1 feat(frontend): enhance EmotionalStats component with detailed mood analysis 2026-03-18 19:11:18 +00:00
f604fcc531 feat(frontend): add warning message for scraping limits 2026-03-18 19:02:11 +00:00
b7aec2b0ea feat(frontend): add favicon
Credit goes to `srip` on flaticon for the image.
2026-03-18 19:00:31 +00:00
1446dd176d feat(frontend): center page selection 2026-03-18 18:53:14 +00:00
c215024ef2 feat(frontend): add deleted user filter
Reddit often shows "[Deleted]" when a user is banned or deletes their post/comment. Keeping the backend faithful to the original dataset is important, so the filtering is done on the frontend.
2026-03-18 18:50:51 +00:00
17ef42e548 feat!(frontend): add cultural, interactional and linguistic stat pages 2026-03-18 18:43:49 +00:00
7e4a91bb5e style(frontend): style api types to be in order of the endpoint 2026-03-18 18:40:39 +00:00
436549641f chore(frontend): add api types for new backend data 2026-03-18 18:37:39 +00:00
3e78a54388 feat(stat): add conversation concentration metric
Remove old `initiator_ratio` metric, which wasn't working due to every event having a `reply_to` value.

This metric was suggested by AI, and it gave surprisingly interesting insights.
2026-03-18 18:36:09 +00:00
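The reason given in 3e78a54388 for dropping `initiator_ratio` can be shown with a small sketch. It assumes, hypothetically, that the metric counted events with no `reply_to` value as conversation initiators and divided by the total number of events; the real implementation is not shown in this log, and the event dictionaries below are invented for illustration.

```python
def initiator_ratio(events):
    """Fraction of events that start a conversation.

    Assumption for this sketch: an event is an initiator when its
    `reply_to` field is None.
    """
    if not events:
        return 0.0
    initiators = sum(1 for event in events if event.get("reply_to") is None)
    return initiators / len(events)


# Invented sample data: every event carries a `reply_to` value,
# as the commit message says is the case in the dataset.
events = [
    {"id": "a", "reply_to": "post1"},
    {"id": "b", "reply_to": "a"},
    {"id": "c", "reply_to": "post1"},
]

ratio = initiator_ratio(events)
print(ratio)
```

Under that assumption the ratio is zero for any dataset in which every event has a `reply_to` value, so it carries no information, which is consistent with the metric being replaced.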
71998c450e fix(db): change title type to text
Occasionally a Reddit post would have a title too long for the existing column type, which would break the schema.
2026-03-17 19:49:03 +00:00
2a00384a55 feat(interaction): add top interaction pairs and initiator ratio methods 2026-03-17 19:03:56 +00:00
8372aa7278 feat(api): add endpoint to view entire dataset 2026-03-17 13:36:41 +00:00
7b5a939271 fix(stats): missing private methods in User obj 2026-03-17 13:36:10 +00:00
2fa1dff4b7 feat(stat): add lexical diversity stat 2026-03-17 13:27:49 +00:00
31fb275ee3 fix(db): incorrect NER column being inserted 2026-03-17 12:53:30 +00:00
8a0f6e71e8 chore(api): rename cultural entity emotion endpoint 2026-03-17 12:31:53 +00:00
9093059d05 refactor(stats): move user stats out of interactional into users 2026-03-17 12:23:03 +00:00
8a13444b16 chore(frontend): add new API types 2026-03-16 16:46:07 +00:00
3468fdc2ea feat(api): add new user and linguistic endpoints 2026-03-16 16:45:11 +00:00
09a4f9036f refactor(stats): add summary and user stat classes for consistency 2026-03-16 16:43:24 +00:00
97fccd073b feat(emotional): add average emotion & dominant emotion stats 2026-03-16 16:41:28 +00:00
94befb61c5 Merge pull request 'Automatic Scraping of dataset options' (#9) from feat/automatic-scraping-datasets into main
Reviewed-on: #9
2026-03-14 21:58:49 +00:00