Data Science
Data Science 101: Don’t Be Duped by Duplicate Data
Having followed the instructions in the previous article in this series, we have downloaded Project Gutenberg’s entire collection of English texts and now have a local repository of approximately 80,000 text files. But something doesn’t smell right here. How can we have this many files when Project Gutenberg claims to contain only …