Data Scientist – Machine Learning
Location: Bangalore (flexible given the COVID situation)
To apply, please email your resume.
Akridata is a US-based startup founded in 2018 to build an edge data platform for the autonomous world.
Rich data in large volumes is being collected at the edge (outside a data center) in use cases like autonomous vehicles, smart manufacturing, satellite imagery, smart retail, and smart agriculture.
These datasets are unstructured (images/videos), large (petabytes per month), and distributed (across edge, on-prem, and cloud), and they form the input for training AI models to reach higher degrees of automation.
Akridata is building products that solve these unique challenges, and aims to be at the forefront of this edge data revolution.
The company is backed by prominent VCs and has its entire software engineering team based in India, which provides ample opportunities for from-scratch design and development.
This is an individual contributor role in the data science team, focusing on developing an in-house machine learning library in Python. This library is aimed at driving the summarization of complex data, and simplifying data exploration and sampling of large subsets of such data. Building such a library involves a range of activities spanning research, development, validation, testing, and documentation.
If you are early in your career, this role is a great way to become a member of the data science team at a promising VC-backed company. You will be:
- exposed to state-of-the-art machine learning research, which can sometimes involve building and publishing new techniques.
- able to learn and see the true performance envelope of the most advanced machine learning algorithms, and gain an appreciation of the trade-offs involved.
- challenged to explore different ways of manipulating complex data over reasonable timelines, with adequate mentoring.
- part of a positive culture with a good work-life balance, collaborating with small teams of developers who bring your work into production.
- Sourcing relevant datasets to evaluate our library.
- Benchmarking against existing machine learning libraries.
- Reporting performance results in a concise manner.
- Ensuring consistent documentation and tests.
Research and development activities will be assigned at a lower priority than the activities listed above.
This role is heavily biased towards maths and statistics, and requires learning a wide range of mathematical techniques over short timelines.
- Ability to produce clean Python code, including abstract classes and modules, to handle data manipulation. Familiarity with Jupyter notebooks.
- 1+ years of experience in machine learning, or 3+ years of experience in data analysis.
- A bachelor’s degree in Computer Science, Physics, Engineering, Mathematics, or another relevant quantitative field. Additional work experience is required for non-CS/IT graduates.
- Understanding of the mathematical theory behind machine learning techniques for dimensionality reduction, clustering, classification, regression, and deep neural networks.
- Familiarity with object-oriented programming and version control systems (Git).
- Hands-on experience in tuning, validating, and benchmarking ML models, and reporting the results to a technical audience.
- Experience in coding up simple ML algorithms from scratch (using numpy/scipy).
Good to have
In addition to the above necessary skills, expertise and experience in one or more of the following will be advantageous:
- Advanced expertise in the above-mentioned necessary skills.
- A portfolio of personal machine learning projects, published on GitHub or similar.
- A post-graduate research degree in Computer Science, Physics, Engineering, Mathematics, or another relevant quantitative field, with relevant research publications.
- Domain expertise in computer vision and/or time-series analysis of large multi-dimensional signals.
- Writing comprehensive tests for machine learning algorithms, covering common edge cases.
- Improving computational performance of Python code using Cython, JIT, multithreading and/or GPUs.
- Experience applying agile and iterative development practices.