Thursday, August 6, 2020

PYTHON CODES

A statement or expression is an instruction the computer will run or execute.

The value in the parentheses is called the argument.

A semantic error is when your logic is wrong: the code runs, but it does not do what you intended.
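A minimal sketch of a semantic error, using a hypothetical pair of averaging functions (the names are made up for illustration). Both versions run without raising an error, but the first one returns the wrong value because of operator precedence.

```python
# Hypothetical example: averaging two numbers.

def average_wrong(a, b):
    # Runs without error, but division binds tighter than addition,
    # so this computes a + (b / 2) instead of the mean.
    return a + b / 2


def average_right(a, b):
    # Parentheses force the addition to happen first.
    return (a + b) / 2


print(average_wrong(4, 6))   # 7.0, which is not the mean
print(average_right(4, 6))   # 5.0
```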

print("Hello\nWorld!")

Hello

World!

Expressions describe a type of operation that computers perform.
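A few arithmetic expressions as a quick illustration. The numbers and variable names here are arbitrary and only serve as an example.

```python
# Each right-hand side is an expression that evaluates to a single value.
total = 43 + 60 + 16 + 41      # addition: 160
hours = total / 60             # true division always returns a float
whole_hours = total // 60      # floor division: 2
result = 2 * 60 + 30           # multiplication happens before addition: 150

print(total, hours, whole_hours, result)
```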

We can treat the string as a sequence and perform sequence operations.

We can also specify a stride value when slicing. A stride of 2 selects every second element.
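A short sketch of indexing, slicing and striding on a string. The sample string is an assumption chosen for illustration.

```python
name = "Michael Jackson"   # hypothetical sample string

print(name[0])       # first character: M
print(name[0:4])     # characters at indexes 0 to 3: Mich
print(name[::2])     # every second character of the whole string: McalJcsn
print(name[0:5:2])   # every second character of the first five: Mca
print(len(name))     # number of characters, including the space: 15
```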

MODULE 2

Lists and tuples are called compound data types.

Tuples

Tuples are an ordered sequence.
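A small sketch of how a tuple behaves as an ordered sequence. The values in `ratings` are made up for illustration.

```python
ratings = (10, 9, 6, 5, 10, 8)   # hypothetical sample values

print(ratings[0])      # indexing works as with strings: 10
print(ratings[-1])     # negative indexes count from the end: 8
print(ratings[1:3])    # slicing returns a new tuple: (9, 6)

# Tuples are immutable, so item assignment raises TypeError.
try:
    ratings[0] = 1
except TypeError as err:
    print("cannot modify a tuple:", err)

# sorted() returns a new list and leaves the tuple unchanged.
print(sorted(ratings))
```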

Saturday, August 1, 2020

Almost every data scientist will spend time working in a database, which is an organized collection of structured data in a computer system. (Remember, structured data is usually organized in a table format with rows and columns, like the following example.)

Last four digits of social security number	Last name	Age
6881	Marshall	23
0121	Rodriguez	19
5538	Cho	59
2972	Parker	33
3154	Sawyer	72

Most databases today are organized as relational databases, which are collections of multiple data sets or tables that link together.
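A minimal sketch of two linked tables, using the sqlite3 module from the Python standard library. The `people` rows are taken from the example table; the `visits` table, its column names and its values are hypothetical and exist only to show how a shared key links tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# Store the digits as TEXT so leading zeros such as "0121" survive.
cur.execute(
    "CREATE TABLE people (ssn_last4 TEXT PRIMARY KEY, last_name TEXT, age INTEGER)"
)
cur.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("6881", "Marshall", 23), ("0121", "Rodriguez", 19), ("5538", "Cho", 59)],
)

# Hypothetical second table that links to people through ssn_last4.
cur.execute("CREATE TABLE visits (ssn_last4 TEXT, visit_date TEXT)")
cur.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("6881", "2020-07-01"), ("5538", "2020-07-03"), ("6881", "2020-07-09")],
)

# JOIN follows the shared key to combine rows from both tables.
rows = cur.execute(
    """
    SELECT p.last_name, COUNT(*) AS n_visits
    FROM people AS p
    JOIN visits AS v ON v.ssn_last4 = p.ssn_last4
    GROUP BY p.last_name
    ORDER BY p.last_name
    """
).fetchall()

print(rows)   # [('Cho', 1), ('Marshall', 2)]
conn.close()
```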

While SQL is the underlying language that drives most work done in relational databases, there are many relational database management systems (RDBMSs) in which you can do that work. As you venture into this field, you’ll run into names like these:

MySQL
Microsoft Access
PostgreSQL
Oracle
IBM DB2
MongoDB (a document-oriented NoSQL database rather than an RDBMS)

Choose the right tools to manage data

Where do you begin? There are dozens of useful data science tools and platforms! Here’s a list of some popular and open source platforms that you can use to begin your own data science journey.

R is a good place to start

R is a programming language and free software environment often used for statistical analysis and data science. Many would-be data scientists start with this tool or with one of the popular R interfaces, and there are hundreds of useful R packages, such as ggplot2 for data visualization.

Python works for general purposes

Python is a popular, general-purpose programming language that can also be used for data science. Pair it with a library like pandas and a useful interface, and Python can help you create new insights and data visualizations.

MATLAB helps crunch numbers

MATLAB was built to focus on numerical computing. It is often used in higher education.

Apache Spark supports big data and machine learning

Apache Spark is an open-source general-purpose framework that can be especially useful for extremely large data sets and the machine learning that uses them.

Wednesday, July 15, 2020

POWER BI PROFESSIONAL TRAINER PROFILE

https://diceanalytics.pk/school/courses-and-workshops/power-bi-workshop-live/?utm_source=FB%20Sponsored&utm_medium=ISB%20Banner&utm_campaign=LIVE%20|%20Power%20BI%20Workshop%2002%20-%2025th%20July%2720&fbclid=IwAR0XYtIkIGJfIuSBChMECQ75DVbA7vmL46i2qCmCqi4ZClgc3wMBLHCjMJY#view-bg-content

https://www.linkedin.com/in/nabeel-wyne-2b98723a/?originalSubdomain=pk

Tuesday, July 14, 2020

CODES

Data Cleaning and Blending

https://towardsdatascience.com/visualizing-covid-19-spread-with-tableau-animations-75890dda23bb

The NYT dataset doesn’t include information about county population, so I’m going to merge the two datasets into one using Python and pd.merge().

The same kind of join can be done with merge in pandas, with a SQL JOIN, or with a VLOOKUP in Excel.
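A minimal sketch of the pd.merge() step. The two small DataFrames, the `fips` key column and all values are hypothetical stand-ins, since the real column names of the two datasets are not shown here.

```python
import pandas as pd

# Hypothetical stand-ins for the two datasets: case counts and populations.
cases = pd.DataFrame(
    {"fips": ["01001", "01003", "01005"], "cases": [120, 340, 55]}
)
population = pd.DataFrame(
    {"fips": ["01001", "01003"], "population": [55869, 223234]}
)

# A left merge keeps every row of `cases`, like a SQL LEFT JOIN.
# Counties without a population match get NaN in the new column.
merged = pd.merge(cases, population, on="fips", how="left")

# Cases per 100,000 residents, which is why the population column is needed.
merged["cases_per_100k"] = merged["cases"] / merged["population"] * 100_000

print(merged)
```

Using how="left" makes unmatched counties visible as missing values instead of silently dropping them, which helps when checking whether the data still needs cleaning.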

Before we do that, we first need to clean the data

Check the GitHub repo here.

Monday, July 13, 2020

Support Vector Machines

Support Vector Machines, Clearly Explained!!!

Plot an ROC Curve in Python

How to Plot an ROC Curve in Python | Machine Learning in Python

ROC and AUC, Clearly Explained!

ROC and AUC in R

machine learning and data science on cloud

Run machine learning and data science workloads in the cloud on high-performance GPUs at no cost.

https://colab.research.google.com/

How to use R with Google Colaboratory?

https://colab.research.google.com/drive/1xj_aYLBBPX2oSQ1I4xp5_YZiVhhpC1Ke

Google Colaboratory supports Python versions 2.7 and 3.6.

I saw an example of how to use Swift in Colab a while ago.

# Please try the newer version here:

https://colab.research.google.com/drive/1BYnnbqeyZAlYnxR9IHC8tpW07EpDeyKR

Or use a Kaggle R Jupyter notebook, which supports R and RStan by default:

https://www.kaggle.com/thimac/rstan?scriptVersionId=20867095

How to use R and Python in same notebook on Google Colab

How to Build Your First Data Science Web App in Python (Streamlit Tutorial Part 1)

Sunday, July 12, 2020

Event Studies.py

EventStudies.py

http://esocialtrader.com/event-studies/

I do have a project going on and wanted to check if you can help me with any of my challenges.

Project: Comparative Analysis on stock returns around M&A announcements.

I need an expert / consultant in Data Analysis and Econometrics with access to professional databases and knowledge possibly in Python, R, Stata or similar tools to conduct an event study with multiple event windows done on a large dataset.

Vision:

I have identified a list of ~17k transactions (the sample) that fulfil the selection criteria. I intend to conduct an event study in order to find abnormal returns for acquirer, target and both combined. The results shall then be presented.

My challenges:

- Reducing the sample? Yes or No?

- Identifying the right indices or basket of comparable (industry, geography, liquid) stock as a proxy of the market portfolio to regress against.

- Sourcing the data for the large number of transactions (acquirer, target, market portfolio).

- Cleaning the data and making sure that each transaction has an equal number of observations.

- Estimate normal returns based on respective market portfolio chosen for the specific event

- Cumulative Abnormal returns for all Acquirers, Targets and both combined (Whole Sample)

- Testing for significance

- Dividing the data into two cohorts based on one simple selection criterion

- Cumulative Abnormal returns for all Acquirers, Targets and both combined (Cohort 1 & Cohort 2)

- Testing for significance
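The estimation and testing steps in the list above could be sketched as follows. This assumes the market model (an ordinary least squares fit of stock returns on market returns over an estimation window) as the normal-return model and a simple cross-sectional t-statistic; the brief does not specify either choice. All function names are hypothetical and the returns are made-up numbers, not real data.

```python
from math import sqrt
from statistics import mean, stdev


def market_model(stock, market):
    """OLS fit of stock = alpha + beta * market over the estimation window."""
    m_bar, s_bar = mean(market), mean(stock)
    cov = sum((m - m_bar) * (s - s_bar) for m, s in zip(market, stock))
    var = sum((m - m_bar) ** 2 for m in market)
    beta = cov / var
    alpha = s_bar - beta * m_bar
    return alpha, beta


def cumulative_abnormal_return(est_stock, est_market, evt_stock, evt_market):
    """Sum of (actual - normal) returns over the event window."""
    alpha, beta = market_model(est_stock, est_market)
    abnormal = [s - (alpha + beta * m) for s, m in zip(evt_stock, evt_market)]
    return sum(abnormal)


def t_statistic(cars):
    """Cross-sectional t-statistic for the hypothesis that the mean CAR is zero."""
    return mean(cars) / (stdev(cars) / sqrt(len(cars)))


# Made-up daily returns for three hypothetical events, each with the same
# number of observations: (estimation stock, estimation market,
# event stock, event market).
events = [
    ([0.010, -0.004, 0.006, 0.002, -0.008], [0.008, -0.002, 0.004, 0.001, -0.006],
     [0.030, 0.012, -0.002], [0.004, 0.002, -0.001]),
    ([0.004, 0.009, -0.006, 0.001, 0.003], [0.003, 0.007, -0.005, 0.002, 0.001],
     [0.018, -0.004, 0.006], [0.002, -0.003, 0.001]),
    ([-0.003, 0.005, 0.007, -0.009, 0.002], [-0.002, 0.004, 0.005, -0.007, 0.003],
     [0.009, 0.015, 0.001], [0.001, 0.005, -0.002]),
]

cars = [cumulative_abnormal_return(*event) for event in events]
print("CARs:", [round(c, 4) for c in cars])
print("mean CAR:", round(mean(cars), 4))
print("t-statistic:", round(t_statistic(cars), 2))
```

Splitting the sample into two cohorts would then amount to filtering `events` by the selection criterion and running the same two final steps on each subset.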

Requirements:

• The candidate must have proven knowledge and understanding of conducting event studies, data analysis and econometrics.

• The candidate should have a good understanding of the academic literature surrounding event studies.

• The candidate must have access to professional databases like Bloomberg, Datastream, CapitalIQ or comparable.

Expectations:

- A solution that requires as little manual intervention as possible and can be reused with a different data set.

- Support in word and deed, acting as a consultant.

- Model documentation and validation including relevant tables and graphs and descriptive statistics

- Source code, if any

Specifications:

Budget: 250 dollars, Delivery time: 7 days (Jul. 16 2020)

Tool: Python

Data Science