🌐 UniTrends: My First Data Engineering Project — US F1 VISA Trends.
Using the Telegram API, Kafka, and AWS tools, I analyzed ‘US F1 VISA’ group chats, refined my YouTube content strategy, and gained 10k subscribers through data-driven insights.
🌟 Embarking on My Data Engineering Journey
As someone new to the realm of data engineering, every step I take feels like both an exploration and a revelation. Venturing into this field, my aim was not just to learn, but to create something meaningful right from the outset. The “US F1 VISA experiences” group on Telegram presented just the right opportunity. Through this project, I hoped to delve deep into the world of data, derive insights, and use those insights to fuel my YouTube channel. The outcome? A significant boost in viewership and an even greater boost in confidence. Join me as I recount this fascinating journey, from the initial challenges to the rewarding outcomes.
🧠 The Learning Landscape
The world of data engineering is expansive. Tools like Kafka, AWS Glue, and Amazon Athena seemed enigmatic at first, but as I worked with them their roles became clear, and each proved instrumental to the project's architecture.
🔧 Project Implementation:
You can access my complete code here:
I followed this YouTube tutorial by Darshil Parmar.
a. Data Extraction from Telegram:
The `Teletest.py` Python script pulled messages from the Telegram group chats through the Telegram API and prepared the raw data for analysis.
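A minimal sketch of what this extraction step can look like, assuming the Telethon client library. The post does not say which library `Teletest.py` uses, so the function names, the session name `unitrends_session`, and the record fields below are illustrative assumptions, not the script's actual contents.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def message_to_record(msg):
    """Flatten a Telegram message object into a plain dict that is easy to serialize."""
    text = (getattr(msg, "message", None) or "").strip()
    return {
        "message_id": msg.id,
        "sent_at": msg.date.astimezone(timezone.utc).isoformat(),
        "sender_id": msg.sender_id,
        "text": text,
    }


async def fetch_records(api_id, api_hash, chat, limit=500):
    """Pull recent text messages from one chat (needs `pip install telethon`)."""
    from telethon import TelegramClient  # lazy import: third-party dependency

    records = []
    # "unitrends_session" is a hypothetical local session file name.
    async with TelegramClient("unitrends_session", api_id, api_hash) as client:
        async for msg in client.iter_messages(chat, limit=limit):
            if msg.message:  # skip media-only and service messages
                records.append(message_to_record(msg))
    return records


# Offline demo with a stand-in object that mimics the attributes used above.
demo = SimpleNamespace(
    id=7,
    date=datetime(2023, 9, 1, 12, 0, tzinfo=timezone.utc),
    sender_id=42,
    message="  Visa approved!  ",
)
print(message_to_record(demo))
```

With real credentials from my.telegram.org, the fetch would be started with `asyncio.run(fetch_records(api_id, api_hash, chat))`. Keeping the flattening step as a separate pure function makes it testable without a network connection.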
b. Real-time Data Streaming with Kafka:
Kafka handled the real-time streaming. The `producer.ipynb` and `consumer.ipynb` notebooks moved the messages through Kafka and on to Amazon S3.
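The producer and consumer sides can be sketched as follows, assuming the `kafka-python` package and a broker on `localhost:9092`. The topic name `telegram_messages` and the function names are hypothetical, and the notebooks themselves may be organized differently.

```python
import json


def serialize(record):
    """Encode one record as UTF-8 JSON bytes for the Kafka message value."""
    return json.dumps(record, ensure_ascii=False).encode("utf-8")


def deserialize(payload):
    """Decode a Kafka message value back into a dict."""
    return json.loads(payload.decode("utf-8"))


def publish(records, topic="telegram_messages", servers="localhost:9092"):
    """Send every record to the topic (needs `pip install kafka-python`)."""
    from kafka import KafkaProducer  # lazy import: third-party dependency

    producer = KafkaProducer(bootstrap_servers=servers, value_serializer=serialize)
    for record in records:
        producer.send(topic, value=record)
    producer.flush()  # block until buffered messages are delivered
    producer.close()


def consume(topic="telegram_messages", servers="localhost:9092"):
    """Yield decoded records from the topic, starting at the oldest offset."""
    from kafka import KafkaConsumer  # lazy import: third-party dependency

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=servers,
        value_deserializer=deserialize,
        auto_offset_reset="earliest",
    )
    for message in consumer:
        yield message.value
```

The consumer loop is a generator, so whatever writes to S3 can simply iterate over `consume()` and handle one record at a time.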
c. Data Storage in Amazon S3:
Amazon S3 stood out for its reliability, ensuring the data's safety and accessibility.
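One way to land each consumed record in S3 is sketched below, assuming `boto3` with AWS credentials already configured. The bucket name, the `raw/telegram` prefix, and the record fields `sent_at` and `message_id` are assumptions for illustration, not details taken from the project.

```python
import json


def object_key(record, prefix="raw/telegram"):
    """Build a date-partitioned S3 key from a record's ISO timestamp and id."""
    day = record["sent_at"][:10]  # "YYYY-MM-DD" portion of the ISO timestamp
    return f"{prefix}/dt={day}/{record['message_id']}.json"


def upload(record, bucket):
    """Write one record as a JSON object (needs `pip install boto3`)."""
    import boto3  # lazy import: third-party dependency

    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=bucket,
        Key=object_key(record),
        Body=json.dumps(record, ensure_ascii=False).encode("utf-8"),
        ContentType="application/json",
    )
```

Writing one object per message keeps the sketch simple, but it produces many small files. Batching records into larger files is usually kinder to downstream query engines.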
d. Structuring with AWS Glue:
AWS Glue played a pivotal role in organizing the data, making it structured and query-ready.
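If the structuring is done with a Glue crawler, which is an assumption since the post does not say how Glue was configured, the setup might look roughly like this with `boto3`. The crawler name, database name, and IAM role ARN are placeholders.

```python
def crawler_config(bucket, role_arn, prefix="raw/telegram", database="unitrends_db"):
    """Build the keyword arguments for Glue's create_crawler call."""
    return {
        "Name": "unitrends_crawler",  # hypothetical crawler name
        "Role": role_arn,  # IAM role that Glue assumes to read the bucket
        "DatabaseName": database,  # Glue Data Catalog database to populate
        "Targets": {"S3Targets": [{"Path": f"s3://{bucket}/{prefix}/"}]},
    }


def run_crawler(config):
    """Create the crawler and start it (needs `pip install boto3`)."""
    import boto3  # lazy import: third-party dependency

    glue = boto3.client("glue")
    glue.create_crawler(**config)
    glue.start_crawler(Name=config["Name"])
```

Once the crawler finishes, the inferred table appears in the Glue Data Catalog, where Athena can query it by name.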
e. Deep Dive Analysis using Amazon Athena:
With Amazon Athena, I queried the structured data and found patterns and trends that informed my content creation.
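As an illustration of the kind of query this step involves, the sketch below counts how often a keyword appears per day. It assumes a table named `telegram_messages` with string columns `sent_at` (ISO timestamp) and `text`. Those names are hypothetical, and the real queries behind the project may differ.

```python
import time


def keyword_trend_sql(keyword, table="telegram_messages"):
    """Build an Athena query counting daily messages that mention a keyword."""
    safe = keyword.lower().replace("'", "''")  # escape single quotes for SQL
    return (
        "SELECT substr(sent_at, 1, 10) AS day, count(*) AS mentions "
        f"FROM {table} "
        f"WHERE strpos(lower(text), '{safe}') > 0 "
        "GROUP BY 1 ORDER BY 1"
    )


def run_query(sql, database, output_location):
    """Run the query and return the raw result rows (needs `pip install boto3`)."""
    import boto3  # lazy import: third-party dependency

    athena = boto3.client("athena")
    query_id = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]
    while True:  # Athena is asynchronous, so poll until the query settles
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    return athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
```

A call such as `run_query(keyword_trend_sql("slot"), "unitrends_db", "s3://my-bucket/athena-results/")` would return one row per day. The database name and output location here are placeholders.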
🌟 Triumphs, Challenges, and Revelations
The journey was filled with highs and lows. The eureka moments when code executed flawlessly, the challenges in tool integration, and the invaluable insights all contributed to a richer learning experience and tangible outcomes.
🌱 Reflections and Forward-Look
The journey into data engineering has just begun. With myriad opportunities ahead, the excitement is palpable. For fellow data aficionados, the invitation is open: explore, innovate, and share. Dive deeper with the GitHub repository and engage on my YouTube channel.
👋 About me:
Snehit Vaddi here!
I’m pursuing a Master’s in CS at the University of Florida. I’m an ML enthusiast and data freak. Teaching and learning go hand-in-hand for me, fueling my tech journey. Oh, and by the way, I’m on the lookout for some exciting Summer ’24 internships in the US. Let’s connect and collaborate!
🤝 LinkedIn: https://www.linkedin.com/in/snehitvaddi/
👩‍💻 GitHub: https://github.com/snehitvaddi