Resume
Sabil's Resume
General Information
Full Name | Salsabil Maulana Akbar |
Title | Data Scientist @ Tokopedia | NLP & IR |
Education
-
2016 - 2020 Bachelor of Mathematics
Bandung Institute of Technology (ITB) - Focus Area in Applied Mathematics (Optimization, PDE, Financial Mathematics)
- Studied core mathematics (e.g., Linear Algebra, Real & Complex Analysis, Multivariable Calculus)
- Also took intermediate statistics courses (Probability Theory, Mathematical Statistics)
- Activities
- Took part in an International Modeling Competition on Fish Dynamics over time
- Participated in a Quantitative Finance (Quant) Portfolio Construction Challenge
Experience
-
2021 (Sep) - current Data Scientist (Search)
Tokopedia
2020 (Nov) - 2021 (Sep) Data Scientist
Pashouses
Open Source Projects
-
2023 ID ASR Dataset Generator
- Built a v0 Indonesian-language ASR (Automatic Speech Recognition) dataset from news channel videos
- Created an end-to-end ETL pipeline (to be dockerized)
- Used the langdetect module for automated subtitle cleaning (inspired by the C4 dataset construction).
- Keywords: ETL Pipeline, Text Processing, NLP Dataset.
-
2023 ID NN-based Splitter
- Forked from the WTSplit repo (previously NNSplit) and adapted its codebase to Indonesian data
Academic Interests
-
Bringing the latest NLP models to Indonesian & its local languages
- LLMs are known to perform poorly on low-resource languages due to limited data volume
- In addition, some resources for local languages in Indonesia have not yet been exploited in recent NLP advances
- Hence, I'm contributing to a project (with several masterful folks from Indonesia) to carry out this mission
-
NLP Generic Toolkit Expansion
- Motivation: Some of the most widely used NLP toolkits aren't available for Indonesian, let alone for its local languages
- Hence, I'd like to take part in this mission to democratize the tools & knowledge across languages
-
Low-resource NLP Research
- Some low-resource languages in Indonesia are syntactically transferable from Indonesian without significant semantic degradation
- Hence, I'm dedicated to bridging NLP into low-resource languages through this approach
- I also aspire to pursue a higher degree in an environment that enables me to drive this vision
Other Interests
- Cracking math problems (at times I long for the challenge of proving or disproving a mathematical statement)
- Learning to write tidy, self-explanatory code (since I wasn't a CS student in college, I have to push myself through self-learning)
- Doing a truly end-to-end deployment from scratch, with my own hands (though I may get help from Stack Overflow or ChatGPT)
- Discovering the beauty of mathematics as a "language" in real-life domains (be it DS, ML, and AI or social science)