SmartReader

A .NET Standard library to extract the main content of a web page.

Remove the clutter

SmartReader gives you a clean article without ads, sidebars, and similar clutter. The result is available both as HTML and as lightly formatted text.

Useful metadata

SmartReader can (usually) find all the metadata you need: author, publication date, site name, language, an excerpt of the article, the featured image, a list of the images found (it can optionally download them and store them as data URIs), and an estimate of the time needed to read the article.
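As a rough illustration of the two features above, the sketch below parses an HTML string and prints the cleaned content plus some of the metadata. It assumes the `Reader(string uri, string text)` constructor, the `GetArticle()` method, and the `Article` property names shown here (`IsReadable`, `Title`, `Author`, `PublicationDate`, `SiteName`, `Language`, `Excerpt`, `FeaturedImage`, `TimeToRead`, `Content`, `TextContent`); check them against the version of the library you install. The URL and the HTML are made up for the example.

```csharp
using System;
using SmartReader;

class Example
{
    static void Main()
    {
        // Hypothetical input: both the URL and the markup are invented for illustration.
        string url = "https://example.com/blog/sample-post";
        string html = @"<html><head><title>Sample post</title></head>
            <body>
              <div class=""sidebar"">Unrelated links</div>
              <article><h1>Sample post</h1>
                <p>First paragraph of the article body.</p>
                <p>Second paragraph of the article body.</p>
              </article>
            </body></html>";

        // Assumed constructor overload: parse the supplied text instead of downloading the page.
        var reader = new Reader(url, html);
        Article article = reader.GetArticle();

        // Very short pages may not be recognized as articles, so check before using the result.
        if (!article.IsReadable)
        {
            Console.WriteLine("No readable article was found.");
            return;
        }

        // Metadata (any of these may be empty when the page does not provide them).
        Console.WriteLine($"Title:        {article.Title}");
        Console.WriteLine($"Author:       {article.Author}");
        Console.WriteLine($"Published:    {article.PublicationDate}");
        Console.WriteLine($"Site name:    {article.SiteName}");
        Console.WriteLine($"Language:     {article.Language}");
        Console.WriteLine($"Excerpt:      {article.Excerpt}");
        Console.WriteLine($"Image:        {article.FeaturedImage}");
        Console.WriteLine($"Time to read: {article.TimeToRead}");

        // The cleaned article, as HTML and as lightly formatted text.
        Console.WriteLine(article.Content);
        Console.WriteLine(article.TextContent);
    }
}
```

Passing the HTML directly keeps the example independent of the network; a real page will usually yield richer metadata than this tiny snippet.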

Well-tested algorithm

The core algorithm is a port of the Readability library, used in Firefox by millions of people.
