"Which Debts Are Worth The Bank's Effort" is a data analysis project that examines a bank's recovery strategies for delinquent debts. It asks whether the extra effort and cost the bank incurs at higher recovery strategy levels produce a significant increase in the amount of money recovered from customers. The analysis is based on a dataset of customer information, including demographic and loan data, which is used to calculate the expected recovery amount for each customer. The work covers data cleaning and preparation, exploratory data analysis, and statistical analysis to determine whether there is a significant discontinuity in the amount recovered at higher recovery strategy levels. The results have practical implications for the bank's collection efforts and can help inform decisions about how to allocate resources for debt collection.
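
One common way to check for a discontinuity like the one described above is to fit a line with an extra indicator term for being above a strategy threshold. The sketch below is only an illustration of that idea, not necessarily the method used in the notebook: the function name `estimate_jump`, the threshold of 1000, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def estimate_jump(expected, actual, threshold):
    """Estimate the jump in actual recovery at `threshold` with a linear fit.

    Fits actual ~ intercept + slope * expected + jump * (expected >= threshold)
    by ordinary least squares and returns the `jump` coefficient.
    """
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    above = (expected >= threshold).astype(float)
    design = np.column_stack([np.ones_like(expected), expected, above])
    coefficients, *_ = np.linalg.lstsq(design, actual, rcond=None)
    return coefficients[2]


# Synthetic data only (hypothetical threshold and amounts, not the bank's data).
rng = np.random.default_rng(0)
expected = rng.uniform(900, 1100, size=400)
noise = rng.normal(0, 20, size=400)
actual = 0.5 * expected + 200 * (expected >= 1000) + noise

jump = estimate_jump(expected, actual, threshold=1000)
print(f"Estimated jump at the threshold: {jump:.1f}")
```

If the estimated jump is larger than the extra cost of the higher strategy level, the additional effort would appear to pay for itself; a full analysis would also report a standard error or p-value for the jump.
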
View on Google Colab

The Nobel Prize is perhaps the world's best-known scientific award. Apart from the honor, prestige, and substantial prize money, the recipient also gets a gold medal showing Alfred Nobel (1833 - 1896), who established the prize. Every year it's given to scientists and scholars in the categories chemistry, literature, physics, physiology or medicine, economics, and peace. The first Nobel Prize was handed out in 1901, and at that time the Prize was very Eurocentric and male-focused, but nowadays it's not biased in any way whatsoever. Surely. Right?
View on Google Colab

The "Streamlining Employee Data" project is a data analysis project that involves organizing and merging human resources data for a small business. The project aims to create a structured and unified view of employee data by gathering all available information about a specific employee into a single file. The analysis is based on scattered data from various sources, including Excel files, CSVs, JSON files, and SQL databases. The project involves data cleaning and preparation, merging data from different sources into a pandas DataFrame, and exporting the merged data to a CSV file. The project addresses the challenge of data management in a growing business and provides a foundation for future data-driven decision-making.
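
The merge-and-export step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the column names (`employee_id`, `office`, `emergency_contact`), the helper `build_employee_table`, and the in-memory sample data are all hypothetical stand-ins for the real files.

```python
import io

import pandas as pd

# Hypothetical stand-ins for two of the scattered sources (a CSV and a JSON file).
office_csv = io.StringIO(
    "employee_id,office\n"
    "A2,Leuven\n"
    "B4,Berlin\n"
)
contacts_json = io.StringIO(
    '[{"employee_id": "A2", "emergency_contact": "Erin"},'
    ' {"employee_id": "C9", "emergency_contact": "Sam"}]'
)


def build_employee_table(csv_source, json_source):
    """Read both sources and merge them into one DataFrame keyed by employee."""
    offices = pd.read_csv(csv_source)
    contacts = pd.read_json(json_source)
    # A left merge keeps every employee from the CSV, even without a contact.
    merged = offices.merge(contacts, on="employee_id", how="left")
    return merged.set_index("employee_id")


employees = build_employee_table(office_csv, contacts_json)

# to_csv() without a path returns the CSV text; pass a filename to write a file.
csv_text = employees.to_csv()
print(csv_text)
```

A left merge is used here so that employees with missing information still appear in the output with empty fields; an outer merge would be the choice if records that exist only in the secondary source should be kept as well.
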
View on Google Colab

The project "Optimizing Online Sports Retail Revenue" is a data analysis portfolio that focuses on analyzing product data, revenue, and website traffic for an online sports clothing company. The goal of the analysis is to identify areas where revenue can be improved and provide recommendations to the marketing and sales teams. The analysis includes examining pricing, reviews, descriptions, and ratings of products.
View on Google Colab

The "Medical Insurance - Exploratory Data Analysis Using Python" project is a data analysis project that focuses on medical insurance datasets. The project aims to explore and analyze the data to uncover insights and patterns that can inform future decision-making. The analysis includes descriptive statistics, data visualization, and hypothesis testing to identify relationships between variables such as age, gender, BMI, smoking status, and medical costs. The project provides insights into the factors that drive medical costs and the differences in medical costs between different groups of individuals. The project demonstrates the use of Python for data analysis and provides a foundation for further research and analysis in the field of medical insurance.
View on Google Colab

The "IBM HR Attrition & Performance" project is a data analysis project that utilizes the IBM HR Attrition & Performance datasets from Kaggle. The project aims to explore and analyze the data to identify factors that contribute to employee attrition and to develop strategies to improve employee retention and performance. The dataset includes a wide range of employee-related variables such as age, gender, job role, performance ratings, and job satisfaction scores, which can be analyzed to identify patterns and trends that can inform human resource strategies.
View on Google Colab

The Titanic dataset is a popular dataset used for exploratory data analysis in data science. It contains information about the passengers who were aboard the Titanic, including their demographics, cabin class, fare, and survival status. Exploring the Titanic dataset can help data scientists develop a deeper understanding of how different factors may have affected passenger survival rates on the ill-fated voyage. It's also a great dataset for practicing data analysis skills and learning how to use data visualization tools.
View on Google Colab

This portfolio focuses on COVID-19, the respiratory disease that was first identified in Wuhan, China, in December 2019. The World Health Organization (WHO) declared COVID-19 a pandemic on March 11, 2020, after it had spread across the globe, causing major outbreaks in countries such as Iran, South Korea, and Italy. The virus spreads through respiratory droplets, and governments have implemented country-wide policies such as shutdowns and quarantines to slow the spread. To monitor and learn from the pandemic, organizations such as the Johns Hopkins University Center for Systems Science and Engineering have created publicly available data repositories to consolidate data from sources like the WHO, the Centers for Disease Control and Prevention (CDC), and the ministries of health of multiple countries. This notebook visualizes COVID-19 data from the first few weeks of the outbreak to examine at what point the outbreak became a global pandemic. Note that the data used in this project was pulled on March 17, 2020, and may not reflect the most up-to-date information available.
View on Google Colab

Life expectancy at birth is a measure of the average time a living being is expected to live. It takes into account several demographic factors like gender, country, or year of birth. Life expectancy at birth can vary over time or between countries for many reasons: the evolution of medicine, the degree of development of countries, or the effect of armed conflicts. Life expectancy varies between genders as well. The data shows that women live longer than men. Why? Several potential factors have been proposed, including biological reasons and the theory that women tend to be more health conscious.
View on Google Colab

In this project, we will take a look at data on SATs across public schools in New York City. Every year, American high school students take SATs, which are standardized tests intended to measure literacy, numeracy, and writing skills. There are three sections: reading, math, and writing, each with a maximum score of 800 points. These tests are extremely important for students and colleges, as they play a pivotal role in the admissions process. Analyzing the performance of schools is important for a variety of stakeholders, including policy and education professionals, researchers, government, and even parents considering which school their children should attend.
View on Google Colab

The project "Analyze International Debt Statistics" involves exploring and analyzing a dataset, collected by The World Bank, containing information about the amount of debt owed by developing countries across several categories. Through this project, we will answer questions such as: what is the total amount of debt owed by the countries listed in the dataset, which country owes the maximum amount of debt, and what is the average amount of debt owed by countries across different debt indicators?
View on Google Colab

The "Degrees That Pay You Back" data analysis project aims to explore the short- and long-term financial implications of choosing a college major. Using data collected by PayScale Inc. from a year-long survey of 1.2 million people with only a bachelor's degree, the project will compare the recommendations from three different methods for determining the optimal number of clusters, apply a k-means clustering analysis, and visualize the results. The project will help individuals in school or navigating the post-grad world evaluate personal interest, difficulty, and career prospects when making this major decision.
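
As a rough illustration of choosing the number of clusters before running k-means, the sketch below scores several candidate values of k with the silhouette coefficient and keeps the best one. It is an assumption-laden example, not the project's code: the source does not say which three methods are compared, so the silhouette method is used here only as one plausible option, and the helper `best_k_by_silhouette` and the synthetic two-column data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def best_k_by_silhouette(features, candidate_ks):
    """Fit k-means for each candidate k and return the k with the top silhouette."""
    scores = {}
    for k in candidate_ks:
        model = KMeans(n_clusters=k, n_init=10, random_state=0)
        labels = model.fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic data only: three well-separated groups standing in for real features.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
features = np.vstack([c + rng.normal(0, 0.5, size=(30, 2)) for c in centers])

best_k, scores = best_k_by_silhouette(features, range(2, 7))
print("Best k by silhouette:", best_k)
```

On real data the methods for picking k often disagree, which is why comparing several of them, as the project does, is more informative than trusting a single score.
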
View on Google Colab