Neural Networks Explained: Flight Similarity Prediction
By Daniel F Cubides
Hey there, tech enthusiasts and curious minds! 👋 Today we’re diving into the fascinating world of neural networks with a real example: predicting flight similarities! Don’t worry if you’re new to this - I’ll break it down so it’s as easy as ordering your favorite pizza.
🧠 Neural Networks: The Basics
Imagine your brain for a second. It’s made up of billions of neurons that send signals to each other, helping you recognize patterns, make decisions, and learn new things. Artificial Neural Networks (ANNs) are inspired by this amazing biological design!
What’s a Neural Network, Really?
A neural network is like a team of digital neurons working together to solve problems. Here’s how they work in simple terms:
- Input Layer: This is where we feed our data (like flight details in our example)
- Hidden Layers: These middle layers do the heavy lifting - processing the information
- Output Layer: This gives us our answer (like “yes, these flights are similar” or “nope, totally different”)
Each “neuron” in these layers takes information, applies a mathematical function to it, and passes the result to the next layer. It’s kind of like a relay race of calculations!
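To make that concrete, here's a tiny sketch of a single artificial neuron in Python. The numbers and names are purely illustrative (they're not part of the flight model), but the idea is exactly this: multiply the inputs by weights, add a bias, and pass the result through an activation function.
# One artificial neuron: weighted sum of inputs plus a bias,
# passed through an activation function (here, ReLU)
def neuron(inputs, weights, bias):
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0, weighted_sum)  # ReLU: negative values become 0

# Two inputs flowing into one neuron
print(neuron([0.5, 1.2], [0.8, -0.3], bias=0.1))  # ≈ 0.14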
Why Are Neural Networks So Cool?
Neural networks are amazing because they can:
- Learn from examples (just like humans!)
- Find patterns that might be invisible to the human eye
- Make predictions about things they’ve never seen before
- Handle messy, real-world data
🔍 Let’s Analyze Our Flight Similarity Code
Now, let’s look at the code we have for predicting flight similarities! This model helps determine when two flights might be related or substitutable for each other.
Feature Engineering: Making Sense of Flight Data
First, our code transforms raw flight data into useful features:
def feature_engineering(df):
    # Make a copy to avoid modifying the original
    df_copy = df.copy()

    # Convert times to numerical values (minutes)
    df_copy['flight_1_minutes'] = df_copy['Flight_1_time'].apply(convert_time_to_minutes)
    df_copy['flight_2_minutes'] = df_copy['Flight_2_time'].apply(convert_time_to_minutes)

    # Create new features
    # Price difference between flights
    df_copy['price_diff'] = abs(df_copy['flight_1_price'] - df_copy['flight_2_price'])
    df_copy['price_ratio'] = df_copy['flight_1_price'] / df_copy['flight_2_price']

    # Time difference between flights
    df_copy['time_diff'] = abs(df_copy['flight_1_minutes'] - df_copy['flight_2_minutes'])

    # Route similarity features
    df_copy['same_origin'] = (df_copy['flight_1_from'] == df_copy['flight_2_from']).astype(int)
    df_copy['same_destination'] = (df_copy['flight_1_to'] == df_copy['flight_2_to']).astype(int)
    df_copy['reversed_route'] = ((df_copy['flight_1_from'] == df_copy['flight_2_to']) &
                                 (df_copy['flight_1_to'] == df_copy['flight_2_from'])).astype(int)

    # Features for connections
    df_copy['possible_connection'] = ((df_copy['flight_1_to'] == df_copy['flight_2_from']) &
                                      (df_copy['flight_1_minutes'] < df_copy['flight_2_minutes'])).astype(int)

    # Drop the original time columns as we now have numerical versions
    df_copy.drop(['Flight_1_time', 'Flight_2_time'], axis=1, inplace=True)

    return df_copy
This is like preparing ingredients before cooking! We’re transforming raw flight data into something our neural network can digest.
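One thing to note: the code above calls a convert_time_to_minutes helper that isn't shown here. Here's a minimal sketch of what it might look like, assuming the times arrive as 'HH:MM' strings (the real helper in the project may handle more formats):
def convert_time_to_minutes(time_str):
    """Convert an 'HH:MM' string into minutes since midnight."""
    hours, minutes = map(int, str(time_str).split(':'))
    return hours * 60 + minutes

print(convert_time_to_minutes('14:30'))  # -> 870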
The Neural Network Architecture
Let’s look at the heart of our code - the neural network model:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

def build_similarity_model(input_dim):
    """Build a neural network model for flight similarity prediction."""
    model = Sequential([
        Dense(64, activation='relu', input_dim=input_dim),
        Dropout(0.3),
        Dense(32, activation='relu'),
        Dropout(0.3),
        Dense(32, activation='relu'),
        Dropout(0.3),
        Dense(32, activation='relu'),
        Dropout(0.3),
        Dense(32, activation='relu'),
        Dropout(0.3),
        Dense(1, activation='sigmoid')  # Output between 0-1 for similarity
    ])

    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['accuracy', tf.keras.metrics.AUC(), tf.keras.metrics.Precision(), tf.keras.metrics.Recall()]
    )

    return model
This creates a neural network with:
- An input layer that takes in our flight features
- Five hidden layers with varying numbers of neurons
- “Dropout” layers that help prevent overfitting (more on this later!)
- An output layer that gives us a similarity score between 0 and 1
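To see how this model gets used, here's a rough training sketch. The random data below just stands in for the engineered flight features and the 0/1 similarity labels - it's not part of the original code:
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 1000 flight pairs, 10 engineered features each, random 0/1 labels
X = np.random.rand(1000, 10).astype('float32')
y = np.random.randint(0, 2, size=1000)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

model = build_similarity_model(input_dim=X_train.shape[1])
history = model.fit(X_train, y_train, validation_data=(X_val, y_val),
                    epochs=10, batch_size=32)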
🤔 Why This Neural Network Works
Our flight similarity neural network has some smart design choices:
Multiple Hidden Layers (Deep Learning)
The model uses five hidden layers, making it what we call a “deep” neural network. These multiple layers help the network learn increasingly complex patterns:
- Earlier layers might learn basic relationships like “similar departure times”
- Middle layers might combine these into patterns like “similar route segments”
- Later layers can abstract even further to complex relationships
Think of it like this: If you were trying to spot similar flights, you’d first check basic things (same airports?), then look at more complex patterns (similar times AND prices?), and finally make a judgment combining everything. Each layer in our network does something similar!
ReLU Activation: The Secret Sauce
Each hidden layer uses the ReLU (Rectified Linear Unit) activation function. It’s like a bouncer at a club that says “if the value is negative, make it zero; otherwise, let it through.”
# ReLU in simple terms:
def relu(x):
    return max(0, x)
This simple function helps the network learn faster and avoid the “vanishing gradient problem” that plagued older neural networks.
Dropout Layers: Fighting Overfitting
Those Dropout(0.3) layers are super important! They randomly turn off 30% of the neurons during training, which:
- Forces the network not to rely too much on any single neuron
- Acts like training multiple smaller networks
- Improves generalization (how well it works on new data)
It’s like studying for a test where you know some pages will be missing - you have to understand the whole subject, not just memorize specific facts!
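If you're curious what that actually looks like, here's a simplified NumPy sketch of dropout at training time (it's not the real Keras internals, just the idea of zeroing out 30% of activations and rescaling the rest):
import numpy as np

def apply_dropout(activations, rate=0.3):
    """Randomly zero out `rate` of the activations and rescale the survivors
    ('inverted dropout'), mimicking a Dropout layer during training."""
    keep_mask = np.random.rand(*activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

layer_output = np.array([0.5, 1.2, 0.8, 0.1, 0.9, 0.3])
print(apply_dropout(layer_output, rate=0.3))  # roughly 30% of values become 0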
Sigmoid Output: Perfect for Similarity
The final layer uses a sigmoid activation function, which squishes any value to between 0 and 1:
import math

# Sigmoid in simple terms:
def sigmoid(x):
    return 1 / (1 + math.exp(-x))
This is perfect for similarity prediction since we want a percentage-like score:
- 0 means “completely different flights”
- 1 means “these flights are practically identical”
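In practice, you'd read that output as a similarity score and apply a threshold when you need a hard yes/no answer. A quick sketch, reusing the hypothetical model and validation data from the training example above (the 0.5 cutoff is just a common default):
# Similarity scores between 0 and 1, one per flight pair
similarity_scores = model.predict(X_val)

# Treat anything above 0.5 as "these flights are similar"
predicted_labels = (similarity_scores >= 0.5).astype(int)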
🎯 How Many Layers Should You Use?
The million-dollar question! This model uses five hidden layers, but is that the right number? Here’s a simple guide:
Layer Count Guidelines:
- Simple problems (like linear classification): 1-2 hidden layers
- Medium complexity (like image recognition): 3-5 hidden layers
- Complex problems (like natural language processing): 5+ hidden layers
For our flight similarity model, five layers might be overkill for some datasets, but it gives the model plenty of capacity to learn complex relationships between flights.
The Goldilocks Principle: Not Too Few, Not Too Many
Choosing the number of layers is like the story of Goldilocks - you want it just right:
- Too few layers: Your model might be too simple to capture complex patterns (underfitting)
- Too many layers: You might overfit to your training data or face vanishing gradient problems
In our code, we can see the model has a reasonable architecture:
- It starts with 64 neurons in the first hidden layer
- Then has four more layers with 32 neurons each
- Each layer is followed by dropout to prevent overfitting
This is a good approach for moderately complex problems like flight similarity!
How to Choose in Practice
There’s no magic formula, but these principles can help:
- Start small and add layers if the model isn’t learning well enough
- Watch for diminishing returns - sometimes more layers just waste computing power
- Consider your data volume - deeper networks need more training data
- Monitor for overfitting - if your model does great on training data but poorly on new data, you might need fewer layers or more dropout
The code also uses some great techniques to help with this:
callbacks = [
    tf.keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True),
    tf.keras.callbacks.ReduceLROnPlateau(factor=0.2, patience=5)
]
The EarlyStopping callback stops training when the model stops improving, which helps prevent overfitting. And ReduceLROnPlateau reduces the learning rate when progress stalls, helping the model fine-tune its weights.
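Wiring these callbacks into training might look something like this - a sketch building on the hypothetical data from earlier, with illustrative values for epochs and validation_split:
history = model.fit(
    X_train, y_train,
    validation_split=0.2,  # hold out part of the training data to monitor progress
    epochs=200,            # EarlyStopping will usually cut this short
    batch_size=32,
    callbacks=callbacks
)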
🚀 Practical Tips for Your Own Neural Networks
If you’re building your own neural network for a similar problem, here are some fun tips:
Start simple! Begin with 1-2 hidden layers and see how it performs before adding complexity.
Feature engineering is king! Good features (like our flight comparison features) can make even a simple model perform well.
Always use dropout - it’s like magic fairy dust that prevents overfitting. The 30% rate in our code is a good starting point.
Monitor training curves - if your training accuracy keeps improving but validation doesn’t, you’re probably overfitting.
Experiment with layer sizes - a common pattern is to start wider (more neurons) and get narrower as you go deeper, like our model's step down from 64 neurons to 32 (64 → 32 → 32 → 32 → 32).
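And if you want to actually see those training curves from the monitoring tip above, a quick matplotlib sketch using the History object returned by model.fit could look like this:
import matplotlib.pyplot as plt

# `history` is the object returned by model.fit(...)
plt.plot(history.history['accuracy'], label='training accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()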
Code Analysis: Alternative Architectures
Let’s look at how we might create simpler or more complex versions of our model:
A Simpler Model (2 Hidden Layers)
def build_simpler_model(input_dim):
    """A simpler model with fewer layers."""
    model = Sequential([
        # First hidden layer: 64 neurons
        Dense(64, activation='relu', input_dim=input_dim),
        Dropout(0.3),
        # Second hidden layer: 32 neurons
        Dense(32, activation='relu'),
        Dropout(0.3),
        # Output layer
        Dense(1, activation='sigmoid')
    ])

    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['accuracy']
    )

    return model
When to use this simpler model:
- When you have a smaller dataset (fewer than 1000 examples)
- When the relationship between features and similarity is fairly straightforward
- When you need faster training times
- As a baseline to compare against more complex models
A More Complex Model (7 Hidden Layers)
def build_deeper_model(input_dim):
    """A deeper model with more layers."""
    model = Sequential([
        # First hidden layer
        Dense(128, activation='relu', input_dim=input_dim),
        Dropout(0.3),
        # More hidden layers with decreasing neuron counts
        Dense(64, activation='relu'),
        Dropout(0.3),
        Dense(64, activation='relu'),
        Dropout(0.3),
        Dense(32, activation='relu'),
        Dropout(0.3),
        Dense(32, activation='relu'),
        Dropout(0.3),
        Dense(16, activation='relu'),
        Dropout(0.3),
        Dense(16, activation='relu'),
        Dropout(0.3),
        # Output layer
        Dense(1, activation='sigmoid')
    ])

    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['accuracy']
    )

    return model
When to use this deeper model:
- With large datasets (10,000+ examples)
- When you have many features with complex interactions
- When simpler models aren’t capturing the patterns well
- When you have the computational resources for longer training times
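One practical way to choose between these architectures is to train both on the same split and compare how they do on held-out data. A rough sketch, reusing the hypothetical X_train/X_val split from earlier:
results = {}
for name, builder in [('simpler', build_simpler_model), ('deeper', build_deeper_model)]:
    candidate = builder(input_dim=X_train.shape[1])
    candidate.fit(X_train, y_train, validation_data=(X_val, y_val),
                  epochs=50, batch_size=32, verbose=0)
    loss, accuracy = candidate.evaluate(X_val, y_val, verbose=0)
    results[name] = accuracy

print(results)  # pick whichever generalizes better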
🎭 Conclusion
Neural networks might seem like magic, but they’re really just clever mathematical models inspired by our own brains. The flight similarity prediction model we explored demonstrates many best practices:
- Thoughtful feature engineering
- Multiple hidden layers to learn complex patterns
- Dropout for regularization
- Appropriate activation functions
Whether you’re predicting flight similarities, recognizing faces, or translating languages, these principles apply across the neural network universe!
Happy learning, and may your neural networks always converge! 🧠✨