webpage_extractors

Web page HTML content extractors.

The goal is to provide several extractors and compare their performance.

Note: This project is under development. APIs may change at any time.

APIs

Basic content extractor. It needs neither language detection nor a stop-word list.

proc extractContentBasic*(s: string, textOnly = false): string =