I tried to find MongoDB connection strings in 1,000 public GitHub repositories
I wanted to see whether I could get other people's MongoDB connection strings just by searching for them on GitHub. Yes, I found a few.
I tried connecting to a few of them and yes, it worked!
Before you call the cops on me, listen to my backstory. 🤗
I was working on a Node.js/Express application for practice when I remembered that I had pushed the .env file to my remote repository. While fixing that mistake, I wondered how many other people had done the same, because the secrets stay in the commit history even if the file is later removed from the repository.
So I gave in to curiosity and ran this GitHub search. While most of the results are not actual connection strings, a good number of them are still live and functional.
[DISCLAIMER: NO HARM INTENDED, THIS IS JUST TO RAISE PUBLIC AWARENESS]
How I scanned through the 1000 repositories
The GitHub Search API returns at most 1,000 results per search, which is where the 1,000 comes from. Using the script below, I collected repositories whose code includes mongodb+srv:
// index.ts
import dotenv from "dotenv";
dotenv.config();
import axios from "axios";
import fs from "fs/promises";
import cliProgress from "cli-progress";

const jsonpath = "list_of_repo.json";

// Shape of the part of the search response that this script uses
type SearchResponse = {
  items: { html_url: string; repository: { html_url: string } }[];
};

const makeSearch = async (page: number) => {
  const config = {
    headers: {
      Authorization: `Token ${process.env.GITHUB_API_TOKEN}`,
    },
  };
  const url = `https://api.github.com/search/code?q=mongodb%2Bsrv+in:file&page=${page}&per_page=100`;
  // axios wraps the response body in `data`
  const result = await axios.get<SearchResponse>(url, config);
  // Map each repository URL to the URL of the matching file
  const obj: Record<string, string> = {};
  result.data.items.forEach((item) => {
    obj[item.repository.html_url] = item.html_url;
  });
  await addToJson(jsonpath, obj);
};

async function addToJson(jsonpath: string, data: Record<string, string>) {
  // Start from an empty string if the file does not exist yet
  const oldJson = await fs.readFile(jsonpath, "utf8").catch(() => "");
  // Merge the new results into whatever was saved by earlier pages
  const merged = { ...(oldJson ? JSON.parse(oldJson) : {}), ...data };
  await fs.writeFile(jsonpath, JSON.stringify(merged, null, 2));
}

async function main() {
  // I included a CLI progress loader because, who doesn’t like a loader.
  const bar1 = new cliProgress.SingleBar(
    {},
    cliProgress.Presets.shades_classic
  );
  // 10 pages of 100 results each
  bar1.start(10, 0);
  for (let i = 1; i <= 10; i++) {
    await makeSearch(i);
    // update() sets the absolute progress value, so pass the page number
    bar1.update(i);
  }
  bar1.stop();
}

main().catch(console.error);
A result does not mean that an actual MongoDB connection string exists; it only means that a file in that repository contains text matching mongodb+srv:
I could go further and write a script that fetches each code URL and runs a regex to find actual connection strings, but that isn't necessary, as my purpose is to raise public awareness and show how we can protect ourselves.
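For anyone who wants to audit their own repositories, the regex step could look roughly like the sketch below. This is not part of my script: the names SRV_URI and findConnectionStrings are made up for illustration, and the pattern is a best-effort assumption about what a mongodb+srv URI with embedded credentials looks like. It deliberately returns only the user and host, never the password.

```typescript
// Hypothetical helper, not part of the original index.ts.
// Matches mongodb+srv://user:password@host and captures the three parts.
// Angle brackets are excluded, so "<password>" placeholders never match.
const SRV_URI =
  /mongodb\+srv:\/\/([^\s:@\/"'`<>]+):([^\s@\/"'`<>]+)@([A-Za-z0-9.-]+)/g;

// Assumed list of common placeholder passwords; extend as needed.
const PLACEHOLDER = /^(\$\{.*\}|password|x+|\*+)$/i;

interface Candidate {
  user: string;
  host: string;
}

function findConnectionStrings(text: string): Candidate[] {
  const found: Candidate[] = [];
  for (const match of text.matchAll(SRV_URI)) {
    const [, user, password, host] = match;
    // Skip template variables and obvious dummy values
    if (PLACEHOLDER.test(password)) continue;
    found.push({ user, host });
  }
  return found;
}
```

Running it over the text of your own files and getting a non-empty array back is a strong hint that a credential needs to be rotated.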
What I discovered and how we can protect ourselves
Some of my discoveries include:
Some of the results come from old commits in the commit history: Just like the mistake that led to this article, sometimes we forget to create a .gitignore file at the beginning of a project and end up with secrets committed somewhere in the commit history. We can use tools like GitGuardian to continually scan our repos for secrets in our source code.
Some results included messages from different log files and environment files: This probably happened because the project had no .gitignore. GitHub provides a repo with numerous .gitignore templates for different languages, frameworks, tools, IDEs, etc., and I created a simple interactive CLI to generate .gitignore templates based on the GitHub list.
You can find the interactive CLI tool to generate your .gitignore templates here: https://www.npmjs.com/package/gittyignore
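If you want a quick sanity check before your first push, something like the sketch below could tell you whether a .gitignore mentions .env at all. The function name ignoresEnvFile and the list of patterns are my own assumptions, and this is a rough literal check only: it does not implement real gitignore matching (negation, directories, wildcards).

```typescript
// Hypothetical pre-push check, not taken from any tool mentioned above.
// Simplification: compares each line against a few literal patterns only.
const ENV_PATTERNS = new Set([".env", "/.env", ".env*", "*.env"]);

function ignoresEnvFile(gitignoreText: string): boolean {
  return gitignoreText
    .split(/\r?\n/)
    .map((line) => line.trim())
    // Drop blank lines and comments
    .filter((line) => line !== "" && !line.startsWith("#"))
    .some((line) => ENV_PATTERNS.has(line));
}
```

You could read the file with fs.readFile and pass its contents in; a false result means the .env file is probably about to be committed.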
Thanks for reading through! 🤗
You can buy me a coffee if you enjoy this article!