|
|
ec91904481
|
refactor(dataset creation): update API methods to return only posts
|
2026-02-09 21:20:08 +00:00 |
|
|
|
152264bda9
|
separate comment and post data structures
This allows for a flat data structure, which is beneficial for data analysis
|
2026-01-22 15:53:47 +00:00 |
|
|
|
096a415f3b
|
fix datetime from boards.ie not being parsed properly
|
2026-01-22 14:49:01 +00:00 |
|
|
|
85388ef6aa
|
Add comment limit to _parse_comments method in BoardsAPI
Some boards.ie threads have thousands of comments, which are slow to fetch with pagination
|
2026-01-19 20:23:11 +00:00 |
|
|
|
e9cf51731d
|
Add comment parsing functionality to BoardsAPI
Pagination is required because comments on boards.ie can span multiple pages.
|
2026-01-19 18:24:44 +00:00 |
|
|
|
4ea9bc8b45
|
Increase max_workers in ThreadPoolExecutor to improve post fetching performance
|
2026-01-17 22:14:34 +00:00 |
|
|
|
db21e86b8e
|
Fix post ID extraction in _parse_thread method
|
2026-01-17 16:18:04 +00:00 |
|
|
|
ed3d89fd27
|
Refactor post fetching to use ThreadPoolExecutor for improved concurrency
|
2026-01-17 16:05:37 +00:00 |
|
|
|
d5e6b7a895
|
Refactor post detail fetching into separate _parse_thread method
|
2026-01-17 14:51:57 +00:00 |
|
|
|
b8ed409e04
|
implement slight efficiency gain in boards.ie pagination
|
2026-01-17 14:43:14 +00:00 |
|
|
|
0523c1a091
|
Refactor logging to use class logger in BoardsAPI
|
2026-01-17 14:37:28 +00:00 |
|
|
|
a1c1e1e0d8
|
patch broken title scrape
|
2026-01-17 14:28:16 +00:00 |
|
|
|
9eec7b00e3
|
Implement BoardsAPI to fetch new category posts and their details
|
2026-01-17 14:25:43 +00:00 |
|