Alright folks, so I recently decided to dig into some tennis data and see if I could glean any insights. I zeroed in on Miomir Kecmanovic because, well, his name caught my eye and I was curious. I started by gathering match data from a couple of different sports data websites. It was messy, I ain’t gonna lie.
First off, I spent a solid chunk of time just cleaning the data. I mean, you get dates in different formats, player names sometimes abbreviated, sometimes full, and all kinds of inconsistencies. I used Python with Pandas to wrestle that beast into shape. Lots of .replace() and pd.to_datetime() calls, let me tell ya. I was swearing at my screen a few times, but hey, that’s part of the fun, right?
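For anyone curious, here is a minimal sketch of that kind of cleanup. The column names, the sample rows, the name variants, and the list of date formats are all made up for illustration; real scraped data will need its own mappings.

```python
import pandas as pd

# Hypothetical raw rows: column names and values are invented for illustration.
raw = pd.DataFrame({
    "date": ["2023-01-15", "15/02/2023", "March 3, 2023"],
    "player": ["M. Kecmanovic", " Miomir Kecmanovic", "Kecmanovic M."],
    "result": ["W", "L", "W"],
})

# Assumed spelling variants, all mapped onto one canonical name.
NAME_VARIANTS = {
    "M. Kecmanovic": "Miomir Kecmanovic",
    "Kecmanovic M.": "Miomir Kecmanovic",
}

# Assumed date formats seen across the sources, tried in order.
DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y"]


def parse_date(text):
    """Return a Timestamp for the first matching format, or NaT if none match."""
    for fmt in DATE_FORMATS:
        try:
            return pd.to_datetime(text, format=fmt)
        except ValueError:
            continue
    return pd.NaT


def clean_matches(df):
    """Normalize player names and dates without touching the input frame."""
    out = df.copy()
    out["player"] = out["player"].str.strip().replace(NAME_VARIANTS)
    out["date"] = pd.to_datetime(out["date"].map(parse_date))
    return out


cleaned = clean_matches(raw)
print(cleaned)
```

Trying explicit formats one by one is slower than letting Pandas guess, but it avoids day/month mix-ups on ambiguous strings like 03/04/2023.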
Next, I wanted to see Kecmanovic’s win/loss record over the past few years. Pretty standard stuff. I grouped the data by year and counted wins and losses. I also calculated his winning percentage. Nothing earth-shattering, but it gave me a baseline. Used Matplotlib to whip up a quick bar chart. Looked kinda decent, if I do say so myself.
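The yearly grouping can be sketched roughly like this. The rows below are invented sample data, not Kecmanovic’s actual results, and the column names (date, result) are assumptions about how the cleaned table is laid out.

```python
import pandas as pd

# Invented sample rows for illustration only, not real match results.
matches = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-03-01", "2021-06-10", "2022-01-20", "2022-08-05", "2022-09-12"]
    ),
    "result": ["W", "L", "W", "W", "L"],
})


def yearly_record(df):
    """Count wins, losses, and winning percentage per calendar year."""
    won = df["result"].eq("W").astype(int)
    summary = (
        df.assign(year=df["date"].dt.year, won=won)
        .groupby("year")["won"]
        .agg(wins="sum", matches="count")
    )
    summary["losses"] = summary["matches"] - summary["wins"]
    summary["win_pct"] = 100 * summary["wins"] / summary["matches"]
    return summary


record = yearly_record(matches)
print(record)
```

From there, something like record["win_pct"].plot.bar() (which uses Matplotlib under the hood) is one way to get the quick bar chart.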
Then, I got a bit more ambitious. I started looking at his performance on different court surfaces: clay, hard, grass. I split the data accordingly and calculated his winning percentage on each surface. Turns out, he does noticeably better on hard courts. Interesting! I visualized this with another bar chart, this time with different colors for each surface.
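A rough sketch of the surface split, again on invented rows with assumed column names. It also keeps the match count per surface, since a win percentage built on a handful of grass matches means a lot less than one built on dozens of hard-court matches.

```python
import pandas as pd

# Invented sample rows for illustration only, not real match results.
matches = pd.DataFrame({
    "surface": ["Hard", "hard ", "Clay", "grass", "Clay", "Hard"],
    "result": ["W", "W", "L", "L", "W", "L"],
})


def surface_summary(df):
    """Win percentage and match count per surface, after normalizing labels."""
    surface = df["surface"].str.strip().str.lower()
    won = df["result"].eq("W").astype(int)
    summary = won.groupby(surface).agg(matches="count", wins="sum")
    summary["win_pct"] = 100 * summary["wins"] / summary["matches"]
    return summary


by_surface = surface_summary(matches)
print(by_surface)
```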
Here’s where it got a little trickier. I wanted to see if his performance varied based on the opponent’s ranking. I grabbed opponent ranking data (more web scraping, yay!) and added it to my existing dataset. Then I created categories of opponent rankings (e.g., top 10, top 50, top 100) and calculated Kecmanovic’s win percentage against each category.
- Top 10: Struggles a bit, as expected.
- Top 50: Pretty solid win rate.
- Top 100: Does well, mostly.
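One way to do the bucketing is pd.cut. This sketch assumes non-overlapping buckets (1-10, 11-50, 51-100, 101+), which is only one reading of "top 10, top 50, top 100", and the rows are invented rather than real results.

```python
import pandas as pd

# Invented sample rows for illustration only, not real match results.
matches = pd.DataFrame({
    "opponent_rank": [3, 8, 25, 47, 60, 95, 150],
    "result": ["L", "L", "W", "L", "W", "W", "W"],
})

# Assumed bucket edges; each bin includes its right edge, e.g. (0, 10].
BIN_EDGES = [0, 10, 50, 100, float("inf")]
BIN_LABELS = ["1-10", "11-50", "51-100", "101+"]


def win_pct_by_opponent_rank(df):
    """Win percentage against each opponent-ranking bucket."""
    bucket = pd.cut(df["opponent_rank"], bins=BIN_EDGES, labels=BIN_LABELS)
    won = df["result"].eq("W").astype(int)
    return (100 * won.groupby(bucket, observed=False).mean()).rename("win_pct")


by_rank = win_pct_by_opponent_rank(matches)
print(by_rank)
```

If cumulative categories are what you want instead (top 50 including the top 10), filtering with df["opponent_rank"] <= 50 per category would be the simpler route.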
I even tried to build a simple predictive model using scikit-learn. I used a logistic regression model to predict whether he would win a match based on his ranking, his opponent’s ranking, and the court surface. The accuracy wasn’t amazing, maybe around 65%, but hey, it was a first attempt. I probably need to add more features and tune the model better.
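A setup along these lines can be sketched with a scikit-learn pipeline. Everything here is an assumption for illustration: the data is synthetic (randomly generated, not real matches), the column names are invented, and the scaling, one-hot encoding, and 5-fold cross-validation are just one reasonable way to wire it up.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data, NOT real match results.
rng = np.random.default_rng(0)
n = 200
matches = pd.DataFrame({
    "player_rank": rng.integers(30, 60, size=n),
    "opponent_rank": rng.integers(1, 200, size=n),
    "surface": rng.choice(["hard", "clay", "grass"], size=n),
})
# Fake outcome: lower-ranked (higher-numbered) opponents are easier to beat, plus noise.
p_win = 1 / (1 + np.exp(-(matches["opponent_rank"] - matches["player_rank"]) / 40))
matches["won"] = rng.random(n) < p_win


def build_model():
    """Scale the two rankings, one-hot encode the surface, then fit logistic regression."""
    preprocess = ColumnTransformer([
        ("ranks", StandardScaler(), ["player_rank", "opponent_rank"]),
        ("surface", OneHotEncoder(handle_unknown="ignore"), ["surface"]),
    ])
    return make_pipeline(preprocess, LogisticRegression(max_iter=1000))


features = matches[["player_rank", "opponent_rank", "surface"]]
target = matches["won"]

# Cross-validated accuracy is less flattering than a single lucky split.
scores = cross_val_score(build_model(), features, target, cv=5, scoring="accuracy")
baseline = max(target.mean(), 1 - target.mean())  # always guess the majority class
print(f"mean CV accuracy: {scores.mean():.2f}, majority baseline: {baseline:.2f}")
```

The baseline line matters: an accuracy figure only says something once it is compared with what always predicting the more common outcome would score.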
Lessons Learned
Honestly, the biggest takeaway was how much time data cleaning takes! Like, 80% of the project was just getting the data into a usable format. But it was a good learning experience. I got more comfortable with Pandas and Matplotlib, and I even dipped my toes into some basic machine learning.
I’m thinking of expanding this project by looking at his serve statistics (aces, double faults) and how they correlate with his match outcomes. Also, maybe compare his performance to other players with similar rankings. Lots more to explore! Let me know if you have any suggestions.

That’s all for now, folks! Hope you found this interesting. Until next time!