To gain a full understanding of how ML algorithms are used and applied, our participants work on a real-life industry project, translating theoretical knowledge into practice and overcoming realistic challenges.
Scope: ~400 work hours total
Data: Real data provided by company
Guidance: Experienced mentors provided by Y-DATA
Support: Weekly meetings with company data-owner
Pulmonary Embolism Identification
Build an algorithm for detecting and classifying PE cases, based on a freely available Kaggle dataset of chest CTPA images.
Speeding up Transformer-based NLP models
Train NLP models based on BERT, ELECTRA, RoBERTa and other models using a GPU, and then experiment with various methods to reduce their complexity and run times on a CPU.
App Domain matching
Develop a matching algorithm that recommends matches between applications and domains.
Full project cycle
Project work follows popular industry standards and methodologies, and draws on the students' growing set of tools to methodically understand and solve a real-world problem. Upon graduation, our students have a full-cycle data science project in their portfolio, covering all industry-standard stages: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation.
Example Project
Automatic detection of low-value queries in a technical Q&A forum
A customer operates a forum where programmers ask each other questions, provide answers, and rate questions by giving them "ups" and "downs". The forum has a core expert community that provides good answers and valuable insights. However, these experts often waste their time handling questions of little to no value: marking questions as duplicates and redirecting them, closing topics with incoherent or irrelevant questions, and so on. As a result, the overall efficiency of the system suffers.
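To make one part of this problem concrete, here is a minimal sketch of how a new question might be flagged as a likely duplicate of an existing one. It is an illustration only, not the method used in the project: it assumes a simple word-overlap (Jaccard) similarity, and the function names and the 0.6 threshold are hypothetical choices made for this example.

```python
import re
from typing import List, Optional, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_likely_duplicate(
    new_question: str,
    existing: List[str],
    threshold: float = 0.6,  # hypothetical cut-off, would need tuning on real data
) -> Optional[int]:
    """Return the index of the most similar existing question if its
    similarity reaches the threshold, otherwise None."""
    new_tokens = tokenize(new_question)
    best_index, best_score = None, 0.0
    for i, question in enumerate(existing):
        score = jaccard(new_tokens, tokenize(question))
        if score > best_score:
            best_index, best_score = i, score
    return best_index if best_score >= threshold else None


existing_questions = [
    "How do I reverse a list in Python?",
    "What is a segmentation fault in C?",
]
match = find_likely_duplicate("how do i reverse a list in python", existing_questions)
no_match = find_likely_duplicate("Why is my CSS grid not centering?", existing_questions)
```

Word overlap only catches near-verbatim repeats; paraphrased duplicates and incoherent or irrelevant questions would need a learned model, which is where a full project cycle comes in.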