I am a reflective programmer and graduate student with experience in
I am a highly motivated, progress-focused student currently pursuing a Master of Science in Information Systems at Northeastern University, with a strong theoretical foundation in computer science and data analytics.
I have four years of industry experience as a Data Engineer at State Street Corporation, where I worked with many big data technologies (Hadoop, Spark, HDFS, MapReduce, Hive, Pig, and Hue). I have cross-platform proficiency (Windows, Unix), strong programming skills (Python, Java, Scala, and shell scripting), and in-depth knowledge of data infrastructure and frameworks. My responsibilities primarily included building data pipelines for data cleansing and data migration from different State Street data sources to the Hadoop environment, and analyzing the data to generate Hive reports according to business requirements while maintaining quality assurance. Within a span of just 9 months, I worked on 3 projects simultaneously. My greatest achievement was designing and developing an automated solution that migrates data from approximately 2,000 servers across different database platforms, such as Oracle, MS SQL, and Sybase, to the Hadoop environment and applies transformations to generate reports according to the client’s requirements, thereby eliminating manual intervention.
• Teach core data science concepts, conduct demos on key tools, and simplify complex topics for students.
• Grade assignments, provide feedback, and facilitate class discussions to promote engagement and critical thinking.
• Hold office hours for personalized student support, while also assisting with course administration and mentoring.
• Independently built a Python project in Databricks that automatically identified and removed over 200 redundant Snowflake databases, reducing costs.
• Migrated DAGs from Airflow 1 to Airflow 2, ensuring minimal disruption to workflows.
• Conducted workshops and led classes in data visualization.
• Addressed student queries, fostering a supportive learning environment.
• Managed grading responsibilities and handled administrative tasks.
• Developed automated data processing systems, reducing manual effort by 6 person-hours per week.
• Improved process efficiency by 15% through automated data pipelines.
• Led a team to build multi-source data extraction pipelines using StreamSets.
• Created robust ETL workflows, increasing efficiency by 30%.