The Titanic Analysis Project is one I would like to discuss in depth, as it is one of my proudest projects and showcases my growth in Python. In this project, my teacher and I had to make predictions from the data recorded for each passenger who boarded the Titanic. We used the DataFrame structure from the pandas library to build it. The task was to predict whether each passenger survived the sinking of the Titanic based on features such as sex, age, and passenger class (Pclass), and then to measure the percentage of passengers we predicted correctly. We created a DataFrame containing the categories and counts of each feature provided in the data. Our strategy was to add one feature at a time and test whether it increased or decreased the overall accuracy. We tested all possible combinations of the attributes provided in the data and arrived at a final accuracy percentage.
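To illustrate the "categories and counts" step described above, here is a minimal sketch of how such a table could be built with pandas. The `survival_counts` helper and the six sample rows are hypothetical and made up for illustration. They are not the code or data from the original project, and they assume a `Survived` column that holds 1 for survivors and 0 otherwise.

```python
import pandas as pd

# Made-up rows for illustration only; the real project used the full passenger data.
sample = pd.DataFrame({
    "Sex": ["female", "female", "male", "male", "male", "female"],
    "Pclass": [1, 3, 3, 1, 2, 2],
    "Survived": [1, 0, 0, 0, 1, 1],
})

def survival_counts(data, feature):
    """Count survivors and non-survivors for each category of one feature."""
    counts = data.groupby([feature, "Survived"]).size().unstack(fill_value=0)
    # Make sure both outcome columns exist even if one never occurs in the data.
    counts = counts.reindex(columns=[0, 1], fill_value=0)
    counts.columns = ["Died", "Survived"]
    return counts

sex_counts = survival_counts(sample, "Sex")
print(sex_counts)
```

A table like this makes it easier to see which categories of a feature are associated with survival before turning them into a prediction rule.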
This is the code used to make the conclusive prediction:
import pandas as pd

def predictions_4(data):
    """ Model with four features (Sex, Age, Pclass, Embarked):
        - Predict a passenger survived if they are female and either Pclass is 1 or 2 or Embarked is C
        - Predict a passenger survived if they are male and younger than 10. """
    predictions = []
    for _, passenger in data.iterrows():
        if passenger['Sex'] == 'female':
            # Female passengers: first or second class, or embarked at 'C'
            if passenger['Pclass'] in (1, 2) or passenger['Embarked'] == 'C':
                predictions.append(1)
            else:
                predictions.append(0)
        elif passenger['Sex'] == 'male':
            # Male passengers: only boys younger than 10 are predicted to survive
            if passenger['Age'] < 10:
                predictions.append(1)
            else:
                predictions.append(0)
    # Return our predictions
    return pd.Series(predictions)

# Make the predictions (full_data is the passenger DataFrame loaded elsewhere in the project)
predictions = predictions_4(full_data)
The final accuracy score of these predictions was 80.13%.
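For readers who want to see how an accuracy score like this can be computed, and how the "one feature at a time" strategy plays out, here is a small sketch. The `accuracy_percent` function, the three rule functions, and the six sample rows are hypothetical and made up for illustration. The scores they produce say nothing about the real dataset, and the project's own scoring code is not shown in this write-up.

```python
import pandas as pd

def accuracy_percent(truth, predictions):
    """Return the percentage of predictions that match the true outcomes."""
    if len(truth) != len(predictions):
        raise ValueError("truth and predictions must have the same length")
    # Reset indexes so the comparison is positional, not label-based.
    truth = pd.Series(truth).reset_index(drop=True)
    predictions = pd.Series(predictions).reset_index(drop=True)
    return (truth == predictions).mean() * 100

# Made-up rows for illustration only.
sample = pd.DataFrame({
    "Sex": ["female", "female", "male", "male", "male", "female"],
    "Age": [29.0, 40.0, 8.0, 35.0, 50.0, 22.0],
    "Survived": [1, 0, 1, 0, 0, 1],
})

def predict_nobody(data):
    # Baseline: predict that no one survived.
    return pd.Series([0] * len(data))

def predict_by_sex(data):
    # Add one feature: predict survival for female passengers.
    return (data["Sex"] == "female").astype(int).reset_index(drop=True)

def predict_by_sex_and_age(data):
    # Add a second feature: also predict survival for boys younger than 10.
    rule = (data["Sex"] == "female") | ((data["Sex"] == "male") & (data["Age"] < 10))
    return rule.astype(int).reset_index(drop=True)

# Score each rule so the effect of adding a feature is visible.
scores = {
    rule.__name__: accuracy_percent(sample["Survived"], rule(sample))
    for rule in (predict_nobody, predict_by_sex, predict_by_sex_and_age)
}
for name, score in scores.items():
    print(f"{name}: {score:.2f}%")
```

Comparing the score before and after each added feature is what shows whether that feature helps or hurts the overall accuracy.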