ScrapeOpen is a collaborative project. It curates code for scraping websites that serve public data sets, and it publishes those data sets in a format accessible to both humans and machines.
A public data set is a collection of observations that is published by a public (not-for-profit) agency and has public value. To be published, a data set must not be proprietary.
Data sets published by the ScrapeOpen project earn 3 stars under the 5-star Open Data plan:
★ make your stuff available on the Web (whatever format) under an open license
★★ make it available as structured data (e.g., Excel instead of image scan of a table)
★★★ make it available in a non-proprietary open format (e.g., CSV instead of Excel)
★★★★ use URIs to denote things, so that people can point at your stuff
★★★★★ link your data to other data to provide context
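That is, every data set is available on the Web under an open license, as structured data, and in a non-proprietary open format.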
Where possible, entities described in the data sets (e.g., people, places) are also linked to other data.
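As a rough illustration of the three-star level described above, the sketch below writes a few records to CSV, a structured, non-proprietary format. It is not taken from the ScrapeOpen code base: the records, the field names, and the to_csv helper are all hypothetical, and a real scraper would obtain its rows from a website rather than from a hard-coded list.

```python
import csv
import io

# Hypothetical records, standing in for rows a scraper might extract.
records = [
    {"station": "A01", "date": "2020-01-01", "value": "12.5"},
    {"station": "B02", "date": "2020-01-01", "value": "9.8"},
]


def to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row.

    Hypothetical helper for illustration only.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(records, ["station", "date", "value"])
print(csv_text)
```

The header row keeps the column names next to the data, so the same file can be opened in a spreadsheet by a person or parsed by a program.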
If you want to contribute, please have a look at the project documentation.