Project Fletcher

| | week 7 - week 8 |
|---|---|
| Design | Individual or pair brainstorming |
| Data | Text data; APIs; NoSQL Databases |
| Modeling | Unsupervised Learning; Natural Language Processing; Approximate Nearest Neighbors |
| Tools | MongoDB; Flask; tweepy; nltk; gensim; scikit-learn |
| Communication | Online Dashboard or Blog Post |

Synopsis:

Project Fletcher involves analyzing text data with one or more Natural Language Processing techniques. It is a two-week project, and students may choose to make it the first step of their passion project (which will come to full fruition during Project Kojak).

### How it works:

Students choose a question related to natural language processing and gather text data to answer it. API sources are preferred but not required.
Students are strongly encouraged to work in pairs for the text gathering portion of this project. Pair programming tends to speed up data gathering and preparation, especially when web scraping is used instead of, or in addition to, API sources.
Techniques students learn and apply in this project include clustering algorithms, keyword extraction, topic models, approximate nearest neighbor algorithms, and NoSQL databases.
The final deliverable will be either a blog post that communicates their findings or an interactive dashboard, depending on what fits the project best. If there is demand, we can organize small group presentations (just like with Project McNulty) to practice presentation skills.
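
To make one of the techniques listed above concrete, here is a minimal sketch of keyword extraction using TF-IDF scores, written with only the Python standard library. The function names (`tokenize`, `tfidf_keywords`), the toy documents, and the simple regex tokenizer are assumptions made for illustration. In a real project, students would more likely use the vectorizers in scikit-learn or gensim from the tools list.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_keywords(docs, top_n=3):
    """Return the top_n highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)

    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # TF-IDF = (relative term frequency) * log(N / document frequency).
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_n])
    return results


# Hypothetical toy corpus, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang in the tree",
]
for doc, keywords in zip(docs, tfidf_keywords(docs)):
    print(doc, "->", keywords)
```

Terms that appear in every document (such as "the" here) get a score of zero, which is why TF-IDF tends to surface words that distinguish one document from the rest.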

Data:

While the data source is left to the choice of each student or pair of students, there are two constraints:

- The project must involve text data
- Students should use MongoDB to store and query the data
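
As a rough illustration of the second constraint, the sketch below stores text documents in MongoDB and queries them with a text search. It assumes the `pymongo` driver and a MongoDB server at `mongodb://localhost:27017`. The database name `fletcher`, the collection name `raw_text`, and the helpers `to_document` and `store_and_search` are hypothetical, and this is not a required project layout.

```python
from datetime import datetime, timezone


def to_document(source, text, **metadata):
    """Build a plain dict that MongoDB can store as a document."""
    return {
        "source": source,
        "text": text,
        "fetched_at": datetime.now(timezone.utc),
        **metadata,
    }


def store_and_search(documents, keyword, uri="mongodb://localhost:27017"):
    """Insert documents, then return those whose text matches the keyword.

    Assumes pymongo is installed and a MongoDB server is reachable at `uri`.
    `documents` must be a non-empty list of dicts.
    """
    # Imported here so that to_document() works without pymongo installed.
    from pymongo import MongoClient, TEXT

    client = MongoClient(uri)
    collection = client["fletcher"]["raw_text"]  # hypothetical names
    collection.insert_many(documents)
    # A text index is required before running $text queries.
    collection.create_index([("text", TEXT)])
    return list(collection.find({"$text": {"$search": keyword}}))


# Example document built from a hypothetical API result.
doc = to_document("twitter", "Learning topic models this week", lang="en")
print(doc["source"], doc["lang"])
```

Calling `store_and_search([doc], "topic")` against a running server would insert the document and return the matching records. Keeping the raw text and its metadata together in a single document makes it easy to re-run the preprocessing later without gathering the data again.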


Timeline: