Data Analysis is to understand problems facing an organization and to explore data in meaningful ways. Data in itself is merely facts and figures. Evaluation of the data can provide advantages to the organization and aid in making business decisions.
Brief Overview of the components Apache Spark is a lightning-fast cluster computing technology, designed for fast computation and based on Hadoop MapReduce.
Pandas is a software library written in Python for data manipulation and analysis.
I hail from Himalayas of Nepal who grew up playing alongside Yeti,
the mythical beast.
I recall the start of my love and passion for coding since mid-2000 when
I first got introduced to QBasic. At present, I am passionate about various engineering domains,
product, and developer relations. I can wear different hats as needed.
In my spare time, I relish to go out for trekking, listen to acoustic music, play the guitar, and
contribute to Wikipedia.