    Why Is My Code So Slow? A Guide to Py-Spy Python Profiling

By Awais · February 5, 2026

The most frustrating issues to debug in data science code aren’t syntax errors or logical mistakes. Instead, they come from code that does exactly what it is supposed to do, but takes its sweet time doing it.

    Functional but inefficient code can be a massive bottleneck in a data science workflow. In this article, I will provide a brief introduction and walk-through of py-spy, a powerful tool designed to profile your Python code. It can pinpoint exactly where your program is spending the most time so inefficiencies can be identified and corrected.

    Example Problem

    Let’s set up a simple research question to write some code for:

    “For all flights going between US states and territories, which departing airport has the longest flights on average?”

    Below is a simple Python script to answer this research question, using data retrieved from the Bureau of Transportation Statistics (BTS). The dataset consists of data from every flight within US states and territories between January and June of 2025 with information on the origin and destination airports. It is approximately 3.5 million rows.

    It calculates the Haversine Distance — the shortest distance between two points on a sphere — for each flight. Then, it groups the results by departing airport to find the average distance and reports the top five.

    import pandas as pd  
    import math  
    import time  
      
      
    def haversine(lat_1, lon_1, lat_2, lon_2):  
        """Calculate the Haversine Distance between two latitude and longitude points"""  
        lat_1_rad = math.radians(lat_1)  
        lon_1_rad = math.radians(lon_1)  
        lat_2_rad = math.radians(lat_2)  
        lon_2_rad = math.radians(lon_2)  
      
        delta_lat = lat_2_rad - lat_1_rad  
        delta_lon = lon_2_rad - lon_1_rad  
      
        R = 6371  # Radius of the earth in km  
      
        return 2*R*math.asin(math.sqrt(math.sin(delta_lat/2)**2 + math.cos(lat_1_rad)*math.cos(lat_2_rad)*(math.sin(delta_lon/2))**2))  
      
      
    if __name__ == '__main__':  
        # Load in flight data to a dataframe  
        flight_data_file = r"./data/2025_flight_data.csv"  
        flights_df = pd.read_csv(flight_data_file)  
      
        # Start timer to see how long analysis takes  
        start = time.time()  
      
        # Calculate the haversine distance between each flight's start and end airport  
        haversine_dists = []  
        for i, row in flights_df.iterrows():  
            haversine_dists.append(haversine(lat_1=row["LATITUDE_ORIGIN"],  
                                             lon_1=row["LONGITUDE_ORIGIN"],  
                                             lat_2=row["LATITUDE_DEST"],  
                                             lon_2=row["LONGITUDE_DEST"]))  
      
        flights_df["Distance"] = haversine_dists  
      
    # Get result by grouping by origin airport, taking the average flight distance and printing the top 5  
        result = (  
            flights_df  
            .groupby('DISPLAY_AIRPORT_NAME_ORIGIN').agg(avg_dist=('Distance', 'mean'))  
            .sort_values('avg_dist', ascending=False)  
        )  
      
        print(result.head(5))  
      
        # End timer and print analysis time  
        end = time.time()  
        print(f"Took {end - start} s")

    Running this code gives the following output:

                                            avg_dist
    DISPLAY_AIRPORT_NAME_ORIGIN                     
    Pago Pago International              4202.493567
    Guam International                   3142.363005
    Luis Munoz Marin International       2386.141780
    Ted Stevens Anchorage International  2246.530036
    Daniel K Inouye International        2211.857407
    Took 169.8935534954071 s

    These results make sense, as the airports listed are in American Samoa, Guam, Puerto Rico, Alaska, and Hawaii, respectively. These are all locations outside of the contiguous United States where one would expect long average flight distances.

    The problem here isn’t the results — which are valid — but the execution time: almost three minutes! While three minutes might be tolerable for a one-off run, it becomes a productivity killer during development. Imagine this as part of a longer data pipeline. Every time a parameter is tweaked, a bug is fixed, or a cell is re-run, you are forced to sit idle while the program runs. That friction breaks your flow and turns a quick analysis into an all-afternoon affair.

    Now let’s see how py-spy can help us diagnose exactly what lines are taking so long.

    What Is Py-Spy?

    To understand what py-spy is doing and the benefits of using it, it helps to compare py-spy to the built-in Python profiler cProfile.

    • cProfile: This is a Tracing Profiler, which works like a stopwatch on each function call. The time between every function call and return is measured and recorded. While highly accurate, this adds significant overhead, because the profiler must constantly pause the program to record data, which can slow the script down considerably.
    • py-spy: This is a Sampling Profiler, which works like a high-speed camera pointed at the whole program. py-spy sits completely outside the running Python process and takes high-frequency snapshots of its state. Each snapshot captures the entire call stack, showing exactly which line of code is running and which functions called it, all the way up to the top level.
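To see the tracing approach concretely, here is a minimal cProfile run on a hypothetical stand-in workload (simulate_work is invented for illustration; it is not part of the flight script):

```python
import cProfile
import io
import math
import pstats


def simulate_work():
    # Stand-in for a slow analysis step.
    return sum(math.sqrt(i) for i in range(200_000))


def main():
    return simulate_work()


profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Report the five most expensive entries by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Note that cProfile instruments every call from inside the process, which is exactly the overhead py-spy avoids by sampling from outside.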

    Running Py-Spy

    In order to run py-spy on a Python script, the py-spy library must be installed in the Python environment.

    pip install py-spy

    Once the py-spy library is installed, our script can be profiled by running the following command in the terminal:

    py-spy record -o profile.svg -r 100 -- python main.py

    Here is what each part of this command is actually doing:

    • py-spy: Calls the tool.
    • record: This tells py-spy to use its “record” mode, which continuously monitors the program while it runs and saves the profiling data.
    • -o profile.svg: This specifies the output filename and format, telling it to output the results as an SVG file called profile.svg.
    • -r 100: This specifies the sampling rate, setting it to 100 times per second. This means that py-spy will check what the program is doing 100 times per second.
    • --: This separates the py-spy command from the Python script command. It tells py-spy that everything following this flag is the command to run, not arguments for py-spy itself.
    • python main.py: This is the command to run the Python script to be profiled with py-spy, in this case running main.py.

    Note: On Linux, running py-spy often requires sudo privileges, since for security reasons reading the memory of another process is restricted.

    After this command finishes running, a profile.svg output file will appear, which lets us dig deeper into which parts of the code take the longest.

    Py-Spy Output

    Figure: Icicle graph output from py-spy

    Opening the output profile.svg reveals the visualization py-spy has created, showing how much time the program spent in different parts of the code. This is known as an Icicle Graph (or sometimes a Flame Graph, when the y-axis is inverted) and is interpreted as follows:

    • Bars: Each colored bar represents a particular function that was called during the execution of the program.
    • X-axis (Population): The horizontal axis represents the collection of all samples taken during the profiling. They are grouped so that the width of a particular bar represents the proportion of the total samples that the program was in the function represented by that bar. Note: This is not a timeline; the ordering does not represent when the function was called, only the total volume of time spent.
    • Y-axis (Stack Depth): The vertical axis represents the call stack. The top bar labeled “all” represents the entire program, and the bars below it represent functions called from “all”. This continues down recursively with each bar broken down into the functions that were called during its execution. The very bottom bar shows the function that was actually running on the CPU when the sample was taken.
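As a toy illustration of what a single stack snapshot contains, Python’s standard traceback module can capture the current call stack (the function names here are invented for the example):

```python
import traceback


def load_data():
    # An outer function, one level up the stack.
    return compute()


def compute():
    # Capture the call stack at this moment, similar to what a
    # sampling profiler snapshots many times per second.
    return [frame.name for frame in traceback.extract_stack()]


stack = load_data()
print(stack[-2:])  # innermost frames: ['load_data', 'compute']
```

Each py-spy sample is essentially one such stack, and the icicle graph aggregates millions of them by width.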

    Interacting with the Graph

    While the image above is static, the actual .svg file generated by py-spy is fully interactive. When you open it in a web browser, you can:

    • Search (Ctrl+F): Highlight specific functions to see where they appear in the stack.
    • Zoom: Click on any bar to zoom in on that specific function and its children, allowing you to isolate complex parts of the call stack.
    • Hover: Hovering over any bar displays the specific function name, file path, line number, and the exact percentage of time it consumed.

    The most critical rule for reading the icicle graph is simply: the wider the bar, the more time was spent in that function. If a function’s bar spans 50% of the graph’s width, that function (including the functions it calls) was on the stack for 50% of all samples taken.

    Diagnosis

    From the icicle graph above, we can see that the bar representing the Pandas iterrows() function is noticeably wide. Hovering over that bar when viewing the profile.svg file reveals that the true proportion for this function was 68.36%. So over 2/3 of the runtime was spent in the iterrows() function. Intuitively this bottleneck makes sense, as iterrows() creates a Pandas Series object for every single row in the loop, causing massive overhead. This reveals a clear target to try and optimize the runtime of the script.
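The overhead is easy to reproduce on a small synthetic DataFrame (the columns a and b are made up for this sketch; absolute timings vary by machine, but the gap is consistently large):

```python
import time

import numpy as np
import pandas as pd

# Tiny synthetic frame standing in for the flight data.
df = pd.DataFrame({"a": np.arange(20_000, dtype=float),
                   "b": np.arange(20_000, dtype=float)})

# Per-row loop: each iteration constructs a new pandas Series object.
start = time.perf_counter()
loop_result = [row["a"] + row["b"] for _, row in df.iterrows()]
loop_time = time.perf_counter() - start

# Vectorized: one C-level operation over whole columns.
start = time.perf_counter()
vec_result = (df["a"] + df["b"]).tolist()
vec_time = time.perf_counter() - start

print(f"iterrows: {loop_time:.4f}s  vectorized: {vec_time:.4f}s")
```

Even at this small scale, the vectorized version is typically orders of magnitude faster, and the two produce identical results.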

    Optimizing The Script

    The clearest path to optimizing this script, based on what was learned from py-spy, is to stop using iterrows() to loop over every row when calculating the haversine distance. Instead, it should be replaced with a vectorized calculation using NumPy that performs the computation for every row in a single function call. So the changes to be made are:

    • Rewrite the haversine() function to use vectorized and efficient C-level NumPy operations that allow whole arrays to be passed in rather than one set of coordinates at a time.
    • Replace the iterrows() loop with a single call to this newly vectorized haversine() function.
    import pandas as pd  
    import numpy as np  
    import time  
      
      
    def haversine(lat_1, lon_1, lat_2, lon_2):  
        """Calculate the Haversine Distance between two latitude and longitude points"""  
        lat_1_rad = np.radians(lat_1)  
        lon_1_rad = np.radians(lon_1)  
        lat_2_rad = np.radians(lat_2)  
        lon_2_rad = np.radians(lon_2)  
      
        delta_lat = lat_2_rad - lat_1_rad  
        delta_lon = lon_2_rad - lon_1_rad  
      
        R = 6371  # Radius of the earth in km  
      
    return 2*R*np.arcsin(np.sqrt(np.sin(delta_lat/2)**2 + np.cos(lat_1_rad)*np.cos(lat_2_rad)*(np.sin(delta_lon/2))**2))  
      
      
    if __name__ == '__main__':  
        # Load in flight data to a dataframe  
        flight_data_file = r"./data/2025_flight_data.csv"  
        flights_df = pd.read_csv(flight_data_file)  
      
        # Start timer to see how long analysis takes  
        start = time.time()  
      
        # Calculate the haversine distance between each flight's start and end airport  
        flights_df["Distance"] = haversine(lat_1=flights_df["LATITUDE_ORIGIN"],  
                                           lon_1=flights_df["LONGITUDE_ORIGIN"],  
                                           lat_2=flights_df["LATITUDE_DEST"],  
                                           lon_2=flights_df["LONGITUDE_DEST"])  
      
    # Get result by grouping by origin airport, taking the average flight distance and printing the top 5  
        result = (  
            flights_df  
            .groupby('DISPLAY_AIRPORT_NAME_ORIGIN').agg(avg_dist=('Distance', 'mean'))  
            .sort_values('avg_dist', ascending=False)  
        )  
      
        print(result.head(5))  
      
        # End timer and print analysis time  
        end = time.time()  
        print(f"Took {end - start} s")

    Running this code gives the following output:

                                            avg_dist
    DISPLAY_AIRPORT_NAME_ORIGIN                     
    Pago Pago International              4202.493567
    Guam International                   3142.363005
    Luis Munoz Marin International       2386.141780
    Ted Stevens Anchorage International  2246.530036
    Daniel K Inouye International        2211.857407
    Took 0.5649983882904053 s

    These results are identical to the results from before the code was optimized, but instead of taking nearly three minutes to process, it took just over half a second!

    Looking Ahead

    If you are reading this from the future (late 2026 or beyond), check whether you are running Python 3.15 or newer. Python 3.15 is expected to introduce a native sampling profiler in the standard library, offering similar functionality to py-spy without requiring external installation. For anyone on Python 3.14 or older, py-spy remains the gold standard.

    This article explored a tool for tackling a common frustration in data science — a script that functions as intended, but is inefficiently written and takes a long time to run. An example script was provided to learn which US departure airports have the longest average flight distance according to the Haversine distance. This script worked as expected, but took almost three minutes to run.

    Using the py-spy Python profiler, we were able to learn that the cause of the inefficiency was the use of the iterrows() function. By replacing iterrows() with a more efficient vectorized calculation of the Haversine distance, the runtime was optimized from three minutes down to just over half a second.
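As a quick sanity check that such a vectorized rewrite matches the scalar original, the two implementations can be compared on a couple of hand-picked coordinate pairs (the coordinates below are illustrative, roughly LAX to JFK and Anchorage to Honolulu, not taken from the BTS dataset):

```python
import math

import numpy as np

R = 6371  # Radius of the earth in km


def haversine_scalar(lat_1, lon_1, lat_2, lon_2):
    # Original math-based version, one coordinate pair at a time.
    lat_1, lon_1, lat_2, lon_2 = map(math.radians, (lat_1, lon_1, lat_2, lon_2))
    delta_lat, delta_lon = lat_2 - lat_1, lon_2 - lon_1
    return 2 * R * math.asin(math.sqrt(
        math.sin(delta_lat / 2) ** 2
        + math.cos(lat_1) * math.cos(lat_2) * math.sin(delta_lon / 2) ** 2))


def haversine_vec(lat_1, lon_1, lat_2, lon_2):
    # Vectorized rewrite operating on whole arrays.
    lat_1, lon_1, lat_2, lon_2 = map(np.radians, (lat_1, lon_1, lat_2, lon_2))
    delta_lat, delta_lon = lat_2 - lat_1, lon_2 - lon_1
    return 2 * R * np.arcsin(np.sqrt(
        np.sin(delta_lat / 2) ** 2
        + np.cos(lat_1) * np.cos(lat_2) * np.sin(delta_lon / 2) ** 2))


lat1 = np.array([33.94, 61.17]); lon1 = np.array([-118.41, -149.98])
lat2 = np.array([40.64, 21.32]); lon2 = np.array([-73.78, -157.92])

vec = haversine_vec(lat1, lon1, lat2, lon2)
scalar = [haversine_scalar(a, b, c, d) for a, b, c, d in zip(lat1, lon1, lat2, lon2)]
print(np.allclose(vec, scalar))  # the two implementations agree element-wise
```

This kind of spot check is cheap insurance whenever a loop is replaced with a vectorized equivalent.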

    See my GitHub Repository for the code from this article, including the preprocessing of the raw data from BTS.

    Thank you for reading!

    Data Sources

    Data from the Bureau of Transportation Statistics (BTS) is a work of the U.S. Federal Government and is in the public domain under 17 U.S.C. § 105. It is free to use, share, and adapt without copyright restriction.
