
How I used Logistic Regression to Calculate Compatibility Scores in a Dating App

A few years back, I worked on a project in the dating space where we wanted to go beyond random profile suggestions. The idea was simple: when a user selects a mood like Long-Term Relationship, Friend, or Outing, they answer a few short questions. Then, as they browse other profiles, each card would display a compatibility score, showing how well they matched with that person based on their answers.

That was my first attempt at using logistic regression in a real-world product.

What is Logistic Regression?

Logistic regression is a statistical model that helps you predict a probability - a number between 0 and 1. It’s widely used when you want to answer "how likely is this to happen?".

In this project, I used it to estimate:

How likely are two people to be compatible, given their answers and preferences?

The beauty of logistic regression is that it converts a weighted sum of inputs (questions, preferences, etc.) into a clean probability value using a sigmoid curve - the famous "S-shaped" function that maps any number into a range between 0 and 1.
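In code, the sigmoid is a one-liner. Here's a minimal standalone sketch (plain JavaScript, not the production code) of how it squashes any weighted sum into that range:

// Sigmoid: maps any real number z to a value strictly between 0 and 1
function sigmoid(z) {
  return 1 / (1 + Math.exp(-z));
}

console.log(sigmoid(-4)); // ≈ 0.02  (very unlikely)
console.log(sigmoid(0));  // = 0.5   (a coin flip)
console.log(sigmoid(3));  // ≈ 0.95  (very likely)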


The idea was simple:

You pick your mood - say Long-Term Relationship, Friend, or Outing - and based on that mood, you answer a few questions. Then, when another user browses profiles, each card shows a compatibility score calculated from how both users answered those mood-based questions.


That’s when I tried to use logistic regression to calculate the score.

Why Logistic Regression?

You might wonder, "Isn't logistic regression just for classification?" That's true - it's usually used to predict categories like yes/no or match/no match. But here, we're doing something a little different. Logistic regression naturally gives a probability between 0 and 1. Instead of turning it into a simple yes or no, we use that probability directly as a compatibility score. So when you see a score on a user card, it's essentially the model saying, "Here's how likely you are to be a good match," in a smooth, continuous way.

This approach lets us go beyond just saying "match" or "no match" and gives a more nuanced view of compatibility.
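To make that distinction concrete, here's an illustrative snippet (the 0.5 threshold is just the usual classification convention, not something from the product):

// Suppose the model outputs a probability for a given pair
const p = 0.82;

// Classic classification would collapse it into a label...
const label = p >= 0.5 ? 'match' : 'no match';

// ...but here the probability itself is the product feature
const compatibilityScore = Math.round(p * 100); // displayed on the card as "82%"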

How It Worked

For each mood (like Long-Term Relationship, Friend, Outing), we had a different set of questions - things like:

  • What are you looking for?
  • Do you smoke or drink?
  • How do you like to spend your weekends?
  • What’s your preferred age range?

Each question had an odds (weight) value stored in the database. I manually set these odds based on how strongly a question might influence compatibility. For example:

  • What are you looking for? Higher (≈1.6–1.7)
  • Do you smoke or drink? Higher (≈1.6–1.7)
  • How do you like to spend your weekends? Medium (≈1.4–1.6)
  • What's your preferred age range? Medium (≈1.4–1.6)
  • What's your zodiac sign? Neutral (1.0)

Those odds became the backbone of the scoring formula.
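Conceptually, the stored weights looked something like this per mood (the names and numbers below are just a sketch, not the real schema):

// Illustrative shape of the per-mood question weights (not the real schema)
const moodQuestionOdds = {
  'long-term-relationship': {
    'what-are-you-looking-for': 1.7,
    'do-you-smoke-or-drink':    1.6,
    'weekend-preference':       1.5,
    'preferred-age-range':      1.4,
    'zodiac-sign':              1.0, // neutral: contributes log(1.0) = 0 to the score
  },
};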

The Compatibility Formula

For each pair of users:

  1. Compare their answers for every question.
  2. Mark it as isMatch = 1 if both answers align, else 0.
  3. Multiply that by the log of the odds (stored in the DB) and sum everything up.

The math behind it:

z = \sum_i \left(\text{isMatch}_i \times \log(\text{odds}_i)\right)
\text{score} = \frac{1}{1 + e^{-z}}

Where:

  • isMatch_i = 1 if the answers match, 0 otherwise.
  • odds_i = the weight or importance of each question.
  • z = combined weighted sum of all matches.
  • score = final compatibility probability (0 to 1).

So, if two users had many matching answers, especially on questions with high odds, their compatibility score leaned closer to 1.
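Here's a tiny worked example with made-up answers, just to see the numbers move - two users match on the two high-odds questions and disagree on the rest:

// Worked example (made-up data): matches on the two high-odds questions only
const questions = [
  { odds: 1.7, isMatch: 1 }, // "What are you looking for?" - both want long-term
  { odds: 1.6, isMatch: 1 }, // "Do you smoke or drink?"    - both said no
  { odds: 1.5, isMatch: 0 }, // weekends                    - different answers
  { odds: 1.0, isMatch: 0 }, // zodiac sign                 - different answers
];

const z = questions.reduce((sum, q) => sum + q.isMatch * Math.log(q.odds), 0);
// z = ln(1.7) + ln(1.6) ≈ 0.53 + 0.47 ≈ 1.0

const score = 1 / (1 + Math.exp(-z));
console.log(score.toFixed(2)); // ≈ 0.73 → shown as 73% compatible

One side effect worth noting: since every odds value is 1.0 or higher and isMatch is never negative, z can't go below zero, so the score never dips under 0.5 - the formula only rewards matches, it never penalizes mismatches.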

A Glimpse from the Code

Here’s the main logic that tied everything together:

async function getCompatibilityScore(user1Id, user2Id, moodId) {
  const { profile: user1, questions: q1 } = await loadCompatibilityData(user1Id, moodId);
  const { profile: user2, questions: q2 } = await loadUser2CompatibilityData(user2Id, moodId);

  let profileSum = 0, partnerSum = 0;

  // Profile-based matching (age, gender preference)
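  // (profileQuestions - the profile-level questions for this mood - is assumed to be loaded earlier; that lookup isn't shown in this snippet)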
  profileSum = profileQuestions.reduce((acc, question) => {
    const isMatch = isProfileQuestionMatch(question, user2);
    return acc + Number(isMatch) * Math.log(question.get('odds'));
  }, 0);

  // Mood question matching (smoking, relationship type, etc.)
  for (let i = 0; i < q1.length; i++) {
    const qMatch = q2.find(q => q.get('id') === q1[i].get('id'));
    if (!qMatch) continue;
    const isMatch = isPartnerMatch(q1[i], qMatch, user2);
    partnerSum += Number(isMatch) * Math.log(q1[i].get('mood_question_compatibilities')[0].get('odds'));
  }

  // Logistic regression formula
  const score = 1 / (1 + Math.exp(-1 * (partnerSum + profileSum)));
  return score.toFixed(2);
}

Caching was done via Redis for performance, so users could see their compatibility instantly while swiping through cards.
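The caching code isn't shown above; a minimal version of that layer, assuming an ioredis client and a simple viewer → candidate key, could look like this:

// Minimal caching sketch (assumes ioredis; the real key scheme and TTL may have differed)
const Redis = require('ioredis');
const redis = new Redis();

async function getCachedCompatibilityScore(user1Id, user2Id, moodId) {
  const key = `compat:${moodId}:${user1Id}:${user2Id}`;

  const cached = await redis.get(key);
  if (cached !== null) return Number(cached);

  const score = await getCompatibilityScore(user1Id, user2Id, moodId);
  await redis.set(key, score, 'EX', 3600); // keep it for an hour
  return Number(score);
}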

Applying What You Learn

Back then, I didn't have a lot of user data, so I relied on intuition to set the initial odds in the database. As more users joined, these weights could be fine-tuned or even learned from real match feedback.
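For illustration, one way those weights could eventually be learned rather than hand-set is to fit the same model against real outcomes (say, mutual likes) with plain gradient descent - a rough sketch, not something we shipped:

// Sketch: learn log-odds weights from feedback instead of setting them by hand.
// Each example: x = per-question isMatch flags (0/1), y = 1 if the pair actually matched.
function trainLogOdds(examples, numQuestions, lr = 0.1, epochs = 500) {
  const w = new Array(numQuestions).fill(0); // log(odds); 0 means odds of 1.0 (neutral)

  for (let epoch = 0; epoch < epochs; epoch++) {
    for (const { x, y } of examples) {
      const z = x.reduce((sum, xi, i) => sum + xi * w[i], 0);
      const p = 1 / (1 + Math.exp(-z));
      for (let i = 0; i < numQuestions; i++) {
        w[i] -= lr * (p - y) * x[i]; // gradient of the log-loss
      }
    }
  }
  return w.map(Math.exp); // back to the odds values stored in the DB
}

Like the scoring formula above, this version has no intercept term, so the learned odds stay directly comparable to the hand-set ones.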

Sometimes the things you learn feel like they'll never be useful and then, one day, it clicks perfectly, and you get a chance to actually implement it in a real product. That moment of connecting theory to practice is incredibly satisfying.

Sometimes, you don’t need deep AI to build something meaningful - just clear logic, good data structure, and a bit of curiosity.