Brian Byrne
Ireland
Joined 6 Oct 2013
Lending Club Data Analysis and Machine Learning
PhD Scholarship:
www.linkedin.com/posts/school-of-accounting-economics-finance-technological-university-dublin_a-fully-funded-phd-in-machine-learning-and-activity-7221421763111387138-NGTh?
We examine here loan default prediction using logistic regression, featuring John Hull's renowned materials from the "Machine Learning in Business" textbook.
www-2.rotman.utoronto.ca/~hull/MLThirdEditionFiles/index3rdEd.html
In this video, we explore the practical application of logistic regression in predicting loan defaults, leveraging a toy dataset from Lending Club, as well as tools and techniques from John Hull's Excel and Python resources.
What You'll Learn:
Introduction to Logistic Regression: Understand the basics of logistic regression, a powerful tool used for binary classification problems, and its role in predicting loan defaults.
Data Exploration: Dive into the Lending Club dataset, which includes features such as home ownership, income, debt-to-income ratio, and credit score, with over 12,000 observations.
Model Training: Follow along as we train a logistic regression model using this dataset. We'll split the data into training, validation, and test sets to ensure robust model evaluation.
Performance Metrics: Learn how to evaluate model performance with various metrics such as confusion matrices, ROC curves, and AUC scores. We'll demonstrate how to use Python's sklearn library to generate these metrics, alongside traditional Excel analyses provided by John Hull.
Decision Criteria: Discover how different decision thresholds impact loan acceptance and rejection, and understand the cost-benefit analysis involved in setting these thresholds.
Advanced Insights: Explore the Receiver Operating Characteristic (ROC) curve and Area Under Curve (AUC) as measures of model predictive ability, and see how these metrics guide decision-making in finance.
Hands-On Coding: We include a short Python code segment to illustrate how you can implement logistic regression and evaluate its performance using sklearn, enhancing the original insights with contemporary data science practices.
This video is ideal for finance professionals, data scientists, and anyone interested in applying machine learning techniques to real-world financial datasets. Whether you're studying John Hull's "Machine Learning in Business" or simply looking to expand your knowledge, this tutorial offers valuable insights and practical skills.
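A minimal sketch of the workflow described above, assuming synthetic data in place of the Lending Club file. The feature columns, the data-generating rule, the 60/20/20 split and the thresholds 0.7 and 0.8 are illustrative assumptions, not values from John Hull's workbook or from the video's notebook.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Lending Club features (NOT the real data).
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.integers(0, 2, n),      # home ownership (1 = owns), hypothetical
    rng.normal(60, 20, n),      # income in $000s, hypothetical
    rng.normal(18, 8, n),       # debt-to-income ratio in %, hypothetical
    rng.normal(690, 30, n),     # credit score, hypothetical
])
# Assumed rule: higher score and income make a good loan more likely.
logit = 0.3 * X[:, 0] + 0.01 * X[:, 1] - 0.03 * X[:, 2] + 0.04 * (X[:, 3] - 690) + 1.0
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)  # 1 = good loan, 0 = default

# Split into training (60%), validation (20%) and test (20%) sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, random_state=0, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0, stratify=y_tmp)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)

# Predicted probability of a good loan on the validation set.
p_val = model.predict_proba(X_val)[:, 1]
auc_val = roc_auc_score(y_val, p_val)
fpr, tpr, thresholds = roc_curve(y_val, p_val)  # points on the ROC curve

# Confusion matrix at two illustrative decision thresholds:
# a loan is accepted only if its predicted probability is at least z.
for z in (0.7, 0.8):
    cm = confusion_matrix(y_val, (p_val >= z).astype(int), labels=[0, 1])
    print(f"threshold {z}: {cm.tolist()}, validation AUC {auc_val:.3f}")
```

Raising the threshold rejects more loans, which trades fewer accepted defaults against more rejected good loans. The test set is held back for a final check once a threshold has been chosen on the validation set.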
Resources:
John Hull's official website and textbook resources: John Hull's Website
Lending Club dataset and logistic regression workbook
Colab link:
colab.research.google.com/drive/1RcQMMXzjnVZXwgrXbg8AoBwrYPh4M7FR?usp=sharing
Views: 131
Videos
Building an event window into the CAPM using a dummy variable
90 views, 14 hours ago
In this Python project, we explore the financial markets using an enhanced version of the Capital Asset Pricing Model (CAPM). Leveraging a GitHub repository from Aldo Dector: github.com/aldodec/ we modify the traditional CAPM to include a dummy variable. This allows us to examine the market impact of significant announcements, such as Apple's decision in February 2024 to retract from the Apple ...
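The dummy-variable idea in this description can be sketched as an ordinary least squares regression on simulated returns. Everything here is an assumption for illustration (the return series, the 5-day window, and the coefficient values); it is not the code from Aldo Dector's repository.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 250                                   # roughly one year of trading days

# Simulated daily market excess returns (hypothetical data).
mkt = rng.normal(0.0004, 0.01, n)

# Event dummy: 1 inside a hypothetical 5-day announcement window, else 0.
dummy = np.zeros(n)
dummy[200:205] = 1.0

# Simulated stock excess returns with an assumed abnormal return of -2% per event day.
true_alpha, true_beta, true_gamma = 0.0, 1.2, -0.02
stock = true_alpha + true_beta * mkt + true_gamma * dummy + rng.normal(0, 0.005, n)

# CAPM with an event dummy: r_stock = alpha + beta * r_mkt + gamma * D + error
X = np.column_stack([np.ones(n), mkt, dummy])
coef, *_ = np.linalg.lstsq(X, stock, rcond=None)
alpha_hat, beta_hat, gamma_hat = coef

print(f"alpha={alpha_hat:.5f}  beta={beta_hat:.3f}  gamma={gamma_hat:.4f}")
```

The coefficient on the dummy estimates the average abnormal daily return inside the event window, after controlling for market movements.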
Scrape option chain data using yfinance Python script in Google Colab
745 views, 3 months ago
Google Colab link: colab.research.google.com/drive/1zuGjwWa4ieEEfetjzsTUYZFh8zmo6xzs?usp=sharing sites.google.com/view/vinegarhill-machinelearninglab/exploratory-data-analysis/automating-data-extraction We start by installing the yfinance library in our Python Colab notebook, which allows us to access financial data from Yahoo Finance. You may have to install the library, using the pip package ...
Ames House Price Dataset from Kaggle opened in Google Colab replete with python notebook
123 views, 4 months ago
Kaggle Notebook from Lee Clemmer www.kaggle.com/code/leeclemmer/exploratory-data-analysis-of-housing-in-ames-iowa#modelling Google Colab link: sites.google.com/view/vinegarhill-machinelearninglab/exploratory-data-analysis John Hull Webpage for Machine Learning www-2.rotman.utoronto.ca/~hull/MLThirdEditionFiles/mlindex1_3rdEd.html The Ames Housing Dataset, compiled by Dean De Cock in 2011, is wi...
Ames House Price Dataset (INRIA Github deployed using Google Colab)
95 views, 4 months ago
Github links github.com/INRIA github.com/INRIA/scikit-learn-mooc Institut national de recherche en sciences et technologies du numérique (INRIA) www.inria.fr/fr Google Colab link: sites.google.com/view/vinegarhill-machinelearninglab/exploratory-data-analysis John Hull (Machine Learning) www-2.rotman.utoronto.ca/~hull/MLThirdEditionFiles/mlindex1_3rdEd.html The Ames Housing Dataset is a popular ...
Statistics for Data Science using Python (Github resources)
87 views, 4 months ago
GitHub link to Stephane Dedieu github.com/DrStef/Statistics-for-Data-Science-with-Python GitHub link to exercises github.com/mrankitgupta/Statistics-for-Data-Science-using-Python Coursera www.coursera.org/learn/statistics-for-data-science-python#about Google Colab link sites.google.com/view/vinegarhill-machinelearninglab/home The "Statistics for Data Science with Python" course, often part of IB...
Econometrics to test for Efficient Market Hypothesis using Wooldridge in Google Colab
107 views, 4 months ago
Python source code for Wooldridge solomonegash.com/woodridge1/index.html Google Colab: sites.google.com/view/vinegarhill-machinelearninglab/home Testing the Efficient Markets Hypothesis (EMH) involves checking if past stock market performance can predict future returns. According to the EMH, you shouldn't be able to use past information to make profitable predictions about future stock prices b...
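One common way to run the test described here is to regress each period's return on the previous period's return and check whether the slope differs from zero. The sketch below uses simulated i.i.d. returns as an assumption; it does not load the Wooldridge dataset or reproduce the linked notebook.

```python
import numpy as np

rng = np.random.default_rng(2)
returns = rng.normal(0.0, 1.0, 500)       # simulated weekly returns in %, hypothetical

# Regress return_t on return_{t-1}: under the EMH the slope should be near zero.
y = returns[1:]
X = np.column_stack([np.ones(len(y)), returns[:-1]])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Classical OLS standard errors and the t-statistic on the lagged return.
resid = y - X @ beta
sigma2 = resid @ resid / (len(y) - X.shape[1])
cov = sigma2 * np.linalg.inv(X.T @ X)
t_stat = beta[1] / np.sqrt(cov[1, 1])

print(f"slope on lagged return = {beta[1]:.4f}, t-statistic = {t_stat:.2f}")
```

A t-statistic that is small in absolute value means the lagged return has no detectable predictive power, which is consistent with the EMH in this simple form.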
A python Notebook for Econometric Analysis (Wooldridge)
325 views, 4 months ago
Please see here original github source: github.com/lystahi/wooldridge_python_notebook/blob/master/Introductory-Econometrics-Examples.ipynb Google Colab link: sites.google.com/view/vinegarhill-machinelearninglab/home Please see also original text: www.cengage.uk/c/introductory-econometrics-a-modern-approach-7e-wooldridge/9781337558860/ Google Books: books.google.ie/books/about/Introductory_Econo...
Developing a Literature Review for Machine Learning in Finance and Economics
176 views, 4 months ago
Please check link to: sites.google.com/view/vinegarhill-machinelearninglab/home Useful reading for developing a Literature Review on Machine Learning with Finance Applications: Big Data: New Tricks for Econometrics Hal R. Varian (2014) www.aeaweb.org/articles?id=10.1257/jep.28.2.3 Machine Learning: An Applied Econometric Approach, Sendhil Mullainathan, Jann Spiess (2017) www.aeaweb.org/articl...
ISLP a key resource for practitioners of Robotic Process Automation
334 views, 11 months ago
ISLP (An Introduction to Statistical Learning with Applications in Python): Authors: Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani Links covered in video: www.statlearning.com/ github.com/intro-stat-learning/ISLP_labs/blob/main/Ch04-classification-lab.ipynb github.com/aldodec/Machine-Learning-in-Financial-Markets github.com/junyanyao/ISLR_Python/blob/master/Ch4 Classificati...
Automating Bankruptcy Prediction using Financial Accounting Ratios in Google Colab
1.3K views, 11 months ago
Automating Bankruptcy Prediction using Financial Ratios - Dataset and Code Please find Google Colab here: colab.research.google.com/drive/1I52DmWEASwhyaCH7WucILRCkACTJdd9u?usp=sharing Here we set up the process to automate bankruptcy prediction using financial ratios! In this tutorial, we use financial data analysis and machine learning (logistic regression) to predict bankruptcy risk in compan...
The JP Morgan Chase Python Training Course: yield curve visualization with chatgpt copilot
634 views, 11 months ago
JPMorgan Chase Python Training for Financial Analysis and Automation: github.com/jpmorganchase/python-training/tree/main Link to Google Colab: colab.research.google.com/drive/1MIxn_j6zIwfbe0OuhbkHLQk0AlZfK1sL?usp=sharing The JPMorgan Chase Python training course serves as a vital resource for mastering core areas of financial analysis and automating standard computations in accounting and finan...
Automating standard computations in Finance and Accounting using code posted on Github
525 views, 11 months ago
Please obtain code from Aldo Dector's Github: github.com/aldodec Link to Updated Google Colab with modifications for library deprecation: colab.research.google.com/drive/1HfQqMZ2xVz8wYa72Nli9gGJfWzK2Y5C3?usp=sharing GitHub as a Source of Python Code that can be employed directly in Google Colab GitHub is a widely-used platform for hosting and sharing code projects, including those written in Py...
The JP Morgan Chase Python Training Course: a Straddle navigated with the chatgpt copilot
433 views, 11 months ago
Using ChatGPT to Understand Complex Python Code on GitHub JP Morgan Chase straddle github: github.com/jpmorganchase/python-training/blob/main/notebooks/2_straddle.ipynb Please find Google Colab here: colab.research.google.com/drive/1BXiKIEjiG-Q-QvDsAtSK5DZa-wnslvVG?usp=sharing GitHub repositories often contain code that can be complex and challenging to understand, especially for newcomers or t...
Automating standard computations in Finance and Accounting with python code from the Kaggle Portal
148 views, 11 months ago
Link to Code Artefact: www.kaggle.com/code/carlolepelaars/introduction-to-financial-mathematics Using Kaggle as a Resource for Code Artefact Development: Kaggle is a widely recognized platform for data science and machine learning competitions, collaboration, and learning. As a resource for developing code artefacts, Kaggle offers a plethora of datasets, notebooks, and kernels shared by the com...
The JP Morgan Chase Python Training Course
1K views, 11 months ago
Automation of Bond Pricing and Mortgage Repayment Scheduling with Python
236 views, 11 months ago
Automation of Annuity Time Values using Python: a used case in RPA optimization
161 views, 11 months ago
Automation of standard Financial and Accounting computations using the python library numpy
232 views, 11 months ago
Google Colab Charts: Matplotlib: Subplotting using subplot2grid, 3D Scatter Plots, Altair vs. Plotly
1K views, 11 months ago
Google Colab Charts: Line Plots, Histograms, Bar plots, Scatter Plots, Pie and Stack Charts
3.5K views, 11 months ago
The Financial Times Bank Fines Dataset (2007 - 2015): automating pivot tables and visualization
116 views, 11 months ago
The FT Bank Fines Dataset (2007 - 2015): Web Scraping with Pandas, Beautiful Soup and Requests
103 views, 11 months ago
Financial Data Transformation with Python Automation - Google Colab
294 views, 11 months ago
Visualization using pandas, matplotlib, seaborn, altair and google colab
314 views, 11 months ago
The Financial Times Bank Fines Dataset (2007 - 2015)
1.2K views, 11 months ago
Data Extraction Automation: pandas for Financial Analysis
661 views, 2 years ago
data manipulation for palmerpenguins using pandas
1.3K views, 2 years ago
ggplot2 visualization of the palmerpenguins dataset
1.3K views, 2 years ago
Data transformation using the tidyverse dplyr package and the palmer penguin dataset in Google Colab
818 views, 2 years ago
Thank you for sharing. This has been very helpful !
Hi Brian. Thank you for making this video available to us. Are you aware of any library that does this computation in vectorized format, i.e. for multiple stocks simultaneously?
No, but in the Espen Haug textbook there is a Monte Carlo approach used to model the value of a financial contract based on multiple underlyings. This would not be a lattice approach.
Thank you for letting me know. I had glanced at Espen Haug's book a while back. Seemed quite useful then.
How about scraping data from the page directly instead of using yfinance?
You could use the source code for yfinance if you wanted a bit more control.
Thank you for posting this Brian! Does this work for an arbitrary date? For example September 29, 2008?
No - only for contemporaneous data.
Where can I download the dataset??
Thanx
❤india
Thank you, it was very good and informative.
I had no clue you could use Python scripts in Google
It seems to be one of the better python notebooks
Hi Brian, amazing video, but an error shows saying no module named wooldridge in line 5
You may need to pip install wooldridge
Have you used pandas profiling?
pip install pandas-profiling. Thank you for the suggestion.
Can you please share this dataset.
www-2.rotman.utoronto.ca/~hull/ofod/index.html. Please check this website
@BrianByrneFinance Thanks a ton for sharing this!
can i get the code sir?
sites.google.com/view/vinegarhill-financelabs/binomial-lattice-framework/tian-1993
Check a bit down the portal page
All videos are useful.
too informative sir -Abuzar
How to do it without using volatility? Putting zero is an error, obviously.
The model needs a volatility input but you could use 0.00001 perhaps
Thanks so much, professor. Can you help me write a command to extract the results of a GARCH model in R to CSV???
I was looking for that kind of content all day long. I am happy that I found your channel.
Glad you enjoy it!
Hello Mr. Brian, I appreciate your work, but I have to ask: in this line where you calculated d1: EuroPrice = CBlackScholes::BSPrice(Spot,K,r,q,v,T,'P'); d1 = (log(Sx/K) + (r-q+v*v/2.0))/v/sqrt(T); why haven't you multiplied (r - q + 0.5v²) by T, as in the original paper of Barone-Adesi and Whaley?
If you look closer you do find that formula works out the same. You divide and divide again which is like multiplying.
@BrianByrneFinance makes sense, thank you
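The point about dividing twice can be checked numerically: dividing by v and then by sqrt(T) gives the same result as dividing once by v*sqrt(T). The sketch below is an assumption-labelled illustration, not the video's C++ code. It keeps the factor T on the drift term, as in the standard d1 formula the question refers to; the line as quoted in the comment shows no T there.

```python
import math

def d1_divide_twice(S, K, r, q, v, T):
    # Numerator divided by v, then divided again by sqrt(T).
    return (math.log(S / K) + (r - q + v * v / 2.0) * T) / v / math.sqrt(T)

def d1_textbook(S, K, r, q, v, T):
    # Standard form: numerator divided once by v * sqrt(T).
    return (math.log(S / K) + (r - q + 0.5 * v ** 2) * T) / (v * math.sqrt(T))

# Illustrative inputs (hypothetical values).
S, K, r, q, v, T = 100.0, 95.0, 0.05, 0.02, 0.25, 0.75
a = d1_divide_twice(S, K, r, q, v, T)
b = d1_textbook(S, K, r, q, v, T)
print(a, b, math.isclose(a, b))
```

The two forms agree for any positive v and T. Dropping the factor T from the drift term would only give the same value when T = 1.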
Hi, I wanted to create the same model based on an Indian dataset which I have created. I wanted to compare random forest, kNN, XGBoost and LSTM, and I don't have any reference for creating that. Can you help?
You might try to use ChatGPT to write the code. Feed it a sample of your data with labels and ask it to generate code for classification.
@BrianByrneFinance Will try that. Also, do we have to normalize the data in the code or manually in the dataset?
@gds231 Yes, normalize.
@BrianByrneFinance How??
@gds231 Subtract the mean and divide by the standard deviation, or use sklearn.
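A small sketch of the normalization described in this thread, using a made-up feature matrix. sklearn's StandardScaler performs the same column-wise calculation (it also uses the population standard deviation), so either route gives the same numbers.

```python
import numpy as np

# Hypothetical feature matrix: rows are observations, columns are features.
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0],
              [4.0, 900.0]])

# Z-score each column: subtract the column mean, divide by the column standard deviation.
mean = X.mean(axis=0)
std = X.std(axis=0)            # population standard deviation (ddof=0)
X_scaled = (X - mean) / std

print(X_scaled.mean(axis=0))   # approximately 0 for each column
print(X_scaled.std(axis=0))    # approximately 1 for each column
```

When there is a train/test split, compute the mean and standard deviation on the training rows only and reuse them on the test rows, so that no information leaks from the test set.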
Where did we use VBA?
sites.google.com/view/vinegarhill-financelabs/monte-carlo
The best lecturer ever! Thanks for sharing
My variance is different. I use nominal numbers for the calculations and not percentages. Is that a problem?
Correct, yes: you should use returns and not prices.
Love the Cython video! Do you have any plans to make more videos about Cython implementations? I would love to see more like this. Great content, cheers!
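To illustrate the difference, here is a short sketch with made-up prices. It converts the price series to returns first and then takes the variance, which is the quantity that volatility-based models expect.

```python
import numpy as np

prices = np.array([100.0, 102.0, 101.0, 105.0, 104.0])   # hypothetical closing prices

# Simple returns: P_t / P_{t-1} - 1
simple_returns = prices[1:] / prices[:-1] - 1.0
# Log returns: ln(P_t) - ln(P_{t-1})
log_returns = np.diff(np.log(prices))

var_prices = prices.var(ddof=1)            # variance of price levels (not what is wanted)
var_returns = simple_returns.var(ddof=1)   # sample variance of returns

print(var_prices, var_returns)
```

The variance of price levels depends on the price scale, while the variance of returns is scale-free, which is why the two numbers differ so much.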
I have to give input to this model so that it predicts whether the company is bankrupt or not. how do i do it? can you please help...?
You could use joblib or pickle
@BrianByrneFinance Ya. But what code to write? Could you provide any reference video please.
@harshgawali5154 www.kaggle.com/datasets/fedesoriano/company-bankruptcy-prediction
@BrianByrneFinance This is the dataset. I want code to give input to the model, so that it gives an answer on whether the company will become bankrupt or not.
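A minimal sketch of the joblib route suggested in this thread: fit a model, save it, reload it, and pass in one new company's ratios. The data, the three ratio columns, the labelling rule and the file name are all hypothetical; this is not the notebook from the video or the Kaggle dataset.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up training data: three financial ratios per company, 1 = bankrupt.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 3))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=300) < -0.5).astype(int)

model = LogisticRegression().fit(X, y)

# Save the fitted model to disk (hypothetical file name in a temporary folder).
path = os.path.join(tempfile.mkdtemp(), "bankruptcy_model.joblib")
joblib.dump(model, path)

# Later, or in another notebook: reload the model and score a new company.
loaded = joblib.load(path)
new_company = np.array([[-1.5, 1.0, 0.2]])          # same column order as in training
pred = int(loaded.predict(new_company)[0])           # 1 = predicted bankrupt, 0 = not
prob = float(loaded.predict_proba(new_company)[0, 1])

print(f"prediction = {pred}, probability of bankruptcy = {prob:.3f}")
```

The new input must have the same columns, in the same order and with the same scaling, as the data the model was trained on.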
decent explanation, but not much conceptually gained. Still helpful tho cuz i understand the walkthrough of the calculation now.
outstanding
So is the forward floating rate and Sk, or strike, is the fixed rate? Correct me if I'm wrong.
Sk is always fixed
Is So = 6.02 a forward rate at the 5th year, or for the 5 + 3 = 8th year?
The term structure is assumed flat.
Thank you a million times!
Thank you very much for sharing your work, you have helped me so much, I have no words to say how grateful I am. greetings. :)
Glad it helped.
Hello Brian, What would you do if you wanted to have the probability of default for 3 years from now?
hello. Great video. I have a question about the outliers. What can we do about them?
Some people winsorize. Please see this ChatGPT link: chat.openai.com/share/36a798a3-d409-4014-8467-da17bc9a2fc9
Hello, thanks for the video. Please, how can I get the book? It's very interesting.
John C. Hull, Options, Futures, and Other Derivatives
Thanks for this video, it is very helpful.
Where can we feed our yield_curve data with live data? Ideally with the same data format.
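Winsorizing caps extreme values at chosen percentiles instead of deleting them. The sketch below is one simple way to do it with NumPy; the helper name, the sample data and the percentile choices are illustrative assumptions.

```python
import numpy as np

def winsorize(x, lower_pct=1.0, upper_pct=99.0):
    """Cap values below/above the given percentiles (hypothetical helper)."""
    lo, hi = np.percentile(x, [lower_pct, upper_pct])
    return np.clip(x, lo, hi)

# Made-up ratio with one extreme outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 1000.0])
w = winsorize(x, lower_pct=10, upper_pct=90)

print(w)
```

The observation count is unchanged and the median is not affected; only the tails are pulled in toward the percentile cut-offs.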
FRED DATABASE
FRED St. Louis
@BrianByrneFinance Thanks. Also, it seems like the 30-year maturity is not reflected correctly. It is supposed to start in 1990, while you can clearly see that there is no curve for this period at the far end of the 3D surface.
I guess this issue arises from the way the code links trace 1 and trace 2. It works only if both extremes have a 'None' value, but it doesn't work if the maturity which has a 'None' value is in between two maturities. E.g. 20y is between 10y and 30y --> 30y isn't shown.
Try taking the code and putting it into ChatGPT. Describe the problem as best you can and then see if ChatGPT can suggest a fix. Request a Python solution.
Thank you, very informative!
Hi Brian . My name is Ilker. I am writing from Turkey. Can I Ask a question to you? Can we apply this model within companies? (Merton Structural Model). Thanks.
Yes, and that is common. It is always worth noting that the model makes assumptions and so long as these are plausible for a given context then the model has validity.
Why is p = (1 - d) / (u - d)? Should it be (e^rt - u) / (u - d)?
But it is for a futures underlying
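The reply can be made concrete with the general risk-neutral probability p = (a - d) / (u - d), where a is the expected growth factor of the underlying over one step. For a stock paying no dividends a = e^(r*dt); for a futures price a = 1, because entering a futures contract costs nothing, which gives p = (1 - d) / (u - d). The sketch below uses CRR-style u and d and illustrative parameter values, all of which are assumptions.

```python
import math

def risk_neutral_prob(u, d, growth):
    """p = (a - d) / (u - d), with a the one-step growth factor of the underlying."""
    return (growth - d) / (u - d)

sigma, r, dt = 0.2, 0.05, 0.25            # hypothetical volatility, rate, step length
u = math.exp(sigma * math.sqrt(dt))       # up factor
d = 1.0 / u                               # down factor

p_stock = risk_neutral_prob(u, d, math.exp(r * dt))   # non-dividend-paying stock
p_futures = risk_neutral_prob(u, d, 1.0)              # futures underlying

print(p_stock, p_futures)
```

With the futures probability, the expected one-step growth p*u + (1 - p)*d equals 1, so the futures price has zero drift under the risk-neutral measure.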
Good work sir🙌🏻
Great explanation, thank you for the help.
Promo_SM 🙃
Font size too small..can't see anything on a phone screen. Great content though! Liked and subscribed!
Important to take account of phone audience. I will keep that in mind.
@BrianByrneFinance Crazy speedup from Cython 🤯 Need to pick up speed on my C learning!
Brian, I appreciate your content on python and the JPM course in particular. I would love to connect with you, I'm a beginner in the space and would like to hear from you on what are the best sources to learn python in a finance context
Possibly ISLP (An Introduction to Statistical Learning with Applications in Python), which comes complete with text, videos and GitHub.
I took the Stanford course years ago in R but I'm a python user (in finance, NYC) so this is an excellent updated resource. I'm recommending this to some college kids in my extended family. Very nice summary, thank you.
I am very impressed how the barriers to entry have really come down in recent times. Top quality educational resource with no major outlay to access computing resources, no requirement for an expensive computing lab, available 24/7….
Hello sir, why doesn't my C++ have any colour? Only white and blue for %%writefile main.cpp and %%script bash.
It just runs the script. Possibly later as colab evolves they may offer that detail.
Thank you so much, Brian; I just started my study for the CFA L2. For 2024, the CFA added some practice modules related to Python. This will be helpful.
Good luck with those exams
Glad you found this valuable. Let me know if you have any feedback. Would love to see some pull requests with suggested adds and edits too!
@runderwood5 Thanks Rob, I plan to run this content through a couple of Masters programmes in Dublin. Any insights, I would be glad to pass along. Best, Brian
Thanks for your video, sir. Can you give me R code for Heston model option pricing? 🙏