Problem Set 5: Project Proposal
Instructions: Submit the proposal as a PDF file using the following submission link.
Project Proposal Guidelines
Submit a proposal (maximum 3 pages excluding references; a double-column layout is allowed) that demonstrates your project’s feasibility and planning. Although this is a proposal, you should show that you have already started working on the project by including some preliminary data analysis and machine learning experiments. Feel free to use a coding assistant to help you accomplish as much as possible in the coming week. Specifically, here’s what you need to include:
1. Problem Motivation and Related Work: Before writing, review the project guidelines on the course website. Explain why your problem matters and provide a brief literature review with proper MLA citations. Focus on 3-5 key papers/projects that relate directly to your approach.
2. Dataset and Feasibility Analysis: Identify your dataset with a clear reference/link. Include a basic statistical analysis covering, for example, dataset size, feature dimensions, and target variable distribution. Show that you’ve successfully loaded and explored the data. This section should convince me that your approach is feasible with that specific dataset, even if you end up using a different dataset in your final project submission.
3. Proposed Methods and Initial Results: List at least two, and ideally three, models you plan to test. Include results from at least one baseline model, reporting both training and validation metrics. Specify whether you will need GPU access. I strongly recommend starting with a simplified version that runs on your laptop before scaling up.
4. GitHub Repository: Create a public GitHub repository with your initial code, baseline results, and a clear README with setup instructions. I expect to see regular commits from all members of the team throughout the semester, showing consistent progress. If you’re not familiar with Git, or want to review best practices, refer to Atlassian’s Git tutorials for a good overview, and here’s a good tutorial. Feel free to share related resources and ask questions on Slack.
5. Timeline: Provide a week-by-week plan from now until the final presentation (end of November), including specific milestones for model development, experiments, and report writing.
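As an illustration of items 2 and 3 above, here is a minimal sketch of the kind of evidence a proposal could report: a dataset summary (size, feature dimension, target distribution) and a majority-class baseline with training and validation accuracy. It assumes a classification task; the toy rows, the helper names (summarize, split, majority_baseline), and the 80/20 split are hypothetical choices made for illustration, not course requirements. Replace the synthetic data with your own loading code.

```python
import random
from collections import Counter

# Hypothetical toy dataset: each row is (feature_vector, label).
# These values are synthetic and stand in for your real data.
random.seed(0)
rows = [
    ([random.random() for _ in range(4)], random.choice(["a", "a", "b"]))
    for _ in range(200)
]


def summarize(rows):
    """Return dataset size, feature dimension, and label counts."""
    label_counts = Counter(label for _, label in rows)
    return {
        "n_samples": len(rows),
        "n_features": len(rows[0][0]),
        "label_counts": dict(label_counts),
    }


def split(rows, val_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split into (train, validation)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def majority_baseline(train, val):
    """Always predict the most frequent training label; report accuracy."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]

    def accuracy(data):
        return sum(label == majority for _, label in data) / len(data)

    return {
        "majority_label": majority,
        "train_accuracy": accuracy(train),
        "val_accuracy": accuracy(val),
    }


summary = summarize(rows)
train, val = split(rows)
baseline = majority_baseline(train, val)

print(summary)
print(baseline)
```

A majority-class baseline is only a floor: any model you propose should beat its validation accuracy, and a large gap between training and validation metrics in later models is worth discussing in the proposal.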
Submission
Submit your proposal as a PDF with your GitHub repository link included. Your proposal will be evaluated on problem clarity, literature review quality, dataset appropriateness, timeline realism, and initial implementation quality. You should submit the proposal both on Moodle (link above) and in the #project-hub channel on Slack. Feel free to comment and ask questions about each other’s projects!