An overview of how often different file formats are used for 3D printing
During our current research on the security of 3D printers and the surrounding ecosystem, we asked ourselves how widely different file formats are used. As there seems to be no clear data, only anecdotal evidence, we decided to check for ourselves.
We analyze which file formats are used most on two popular online marketplaces for 3D models, namely Thingiverse and MyMiniFactory. The data presented here is based on the publicly available files uploaded to both platforms. It says nothing about the usage of file formats outside this specific use case of private users sharing their 3D printing models with others, and in particular nothing about industrial contexts. No other data is freely available. The How to Get the Data section below provides download links and how-tos for our dataset. The data presented here was collected in June 2021.
Both Thingiverse and MyMiniFactory provide application programming interfaces (APIs) that allow access to their data sets, specifically JSON-based HTTP REST APIs.
The data set for Thingiverse contains more than two million entries, whereas the set for MyMiniFactory amounts to roughly 130,000 entries. For each object, the downloaded metadata includes the file names of all files uploaded to that object. To ease the analysis, we stored the uploaded file names and their upload timestamps for each object. Then, we reduced each file name to its suffix(es) (i.e. its file extension) and unified them to lower case. This means our analysis is limited to what can be derived from the file suffixes the uploader used. The suffix of an uploaded file might not match its actual content. Additionally, we do not analyze the content of uploaded .zip (or similar) files.
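The suffix reduction described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from our scripts: the function name `normalized_suffix` and the `all_suffixes` switch are hypothetical and only show the two obvious ways to read "suffix(es)".

```python
from pathlib import PurePosixPath


def normalized_suffix(file_name: str, all_suffixes: bool = False) -> str:
    """Return the lower-cased file extension of an uploaded file name.

    With all_suffixes=True, every dot-separated suffix is kept
    (e.g. '.tar.gz'); otherwise only the last one is returned.
    A name without an extension yields an empty string.
    """
    path = PurePosixPath(file_name)
    suffix = "".join(path.suffixes) if all_suffixes else path.suffix
    return suffix.lower()
```

Note that this only looks at the name. A file called `model.STL` is counted as .stl no matter what it actually contains, which is exactly the limitation mentioned above.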
Table 1 lists the file formats that occur the most often in our data sets.
Table 1: Total number of occurrences/uploads of all file formats that occur more than 10,000 times. The suffixes were unified to their lower-case version and the following suffixes were omitted: .pdf, .zip, .0, .1, .svg. AMF is included as it is mentioned by various rankings.
Suffix | File Format Description | Total Occurrences | Repetition Factor |
---|---|---|---|
.stl | STereoLithography | 4,592,742 | 2.13 |
.scad | OpenSCAD project file | 77,585 | 1.42 |
.obj | Wavefront Object | 65,556 | 1.86 |
.step | STandard for the Exchange of Product model data | 44,920 | 1.72 |
.sldprt | SolidWorks Part file | 43,599 | 2.00 |
.skp | SketchUp project file | 32,522 | 1.48 |
.f3d | Fusion 360 project file | 32,275 | 1.30 |
.fcstd | FreeCAD project file | 21,436 | 1.52 |
.dxf | Drawing Interchange File for AutoCAD | 20,566 | 1.94 |
.gcode | Toolpath instruction for manufacturing devices | 16,713 | 1.52 |
.ipt | Inventor project file | 14,905 | 1.96 |
.3mf | 3D Manufacturing Format | 14,823 | 1.63 |
.blend | Blender project file | 13,720 | 1.61 |
.123dx | 123D project file | 12,146 | 1.55 |
︙ | ︙ | ︙ | ︙ |
.amf | Additive Manufacturing Format | 2,451 | 1.54 |
Overall, only 3% of objects do not have an associated STL file. These objects mostly use .obj, .scad, and .dxf instead. Further, nine of the fifteen listed formats are project files for specific programs. Together, these facts suggest that the most common use case is for a user to upload an STL file along with the project file of the software they created the STL with. Alternatively, the model is uploaded as an OBJ file, or in popular exchange file formats for Computer Aided Design (CAD) software (i.e. .scad and .dxf).
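For readers who want to reproduce numbers like these from the parsed data, here is a rough sketch. It assumes a mapping from object ID to the list of normalized suffixes of that object; this layout and the function names `count_suffixes` and `share_without` are assumptions made for the example, not the actual structure of our JSON files.

```python
from collections import Counter


def count_suffixes(objects, min_count=1, omit=frozenset()):
    """Count how often each suffix occurs across all objects.

    objects maps an object ID to the list of suffixes uploaded to it.
    Suffixes in omit, or with fewer than min_count occurrences, are dropped.
    """
    totals = Counter(s for suffixes in objects.values() for s in suffixes)
    return {
        suffix: count
        for suffix, count in totals.most_common()
        if count >= min_count and suffix not in omit
    }


def share_without(objects, suffix=".stl"):
    """Return the fraction of objects that have no file with this suffix."""
    if not objects:
        return 0.0
    missing = sum(1 for suffixes in objects.values() if suffix not in suffixes)
    return missing / len(objects)
```

Calling `count_suffixes(objects, min_count=10_000, omit={".pdf", ".zip", ".0", ".1", ".svg"})` would mirror the filtering described in the caption of Table 1.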
To get an overview of the change in usage we plotted the uploads per month of each file format.
As some file formats support multiple models in one file and others do not, we ignore duplicate suffixes on files for the same object that were uploaded on the same day. This sanitization is required, as otherwise there might be biases towards the formats that do not support multiple models in one file, as a user would have to upload multiple files for a complex model with separated parts. This reduces the variance in the repetition factor from Table 1.
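The sanitization step can be sketched as follows. The input shape, an iterable of `(object_id, suffix, uploaded_at)` tuples with `datetime` timestamps, and the function name `uploads_per_month` are assumptions for this example; our scripts may organize the data differently.

```python
from collections import Counter
from datetime import datetime


def uploads_per_month(uploads):
    """Count uploads per (month, suffix), ignoring same-day duplicates.

    A suffix is counted at most once per object and calendar day, so an
    object with ten .stl parts uploaded on one day contributes one upload.
    """
    seen = set()
    per_month = Counter()
    for object_id, suffix, uploaded_at in uploads:
        key = (object_id, suffix, uploaded_at.date())
        if key in seen:
            continue  # same format, same object, same day: skip
        seen.add(key)
        per_month[(uploaded_at.strftime("%Y-%m"), suffix)] += 1
    return per_month
```

Uploads of the same format to the same object on different days still count separately, since they more likely represent a real update than a multi-part model.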
As you can see in the graph below, .obj, .step, and .f3d all follow a nearly identical curve that shows a rapid increase in usage. .3mf shows less usage overall, but a rapid increase since its initial release. .sldprt, .fcstd, .dxf, .gcode, and .blend show steadier growth. .ipt and .amf both fluctuate more than the others and seem more or less stagnant. .skp and .123dx are declining in usage. In the case of .123dx this is expected, since Autodesk discontinued the 123D program suite in 2016.
Download the data we used (collected in June 2021):

- /raw_data
  - thingiverse.zip (5.8 GB)
  - myminifactory.zip (296 KB)
- /parsed_data
  - extracted_data.json (228 MB): failed_opens states how many source files failed to open (corrupted file). nr_thingiverse_files and nr_myminifactory_files state how many files were added from the respective database.
  - file_analysis_raw.json (6.7 MB)
  - file_analysis.json (9.4 KB)
  - format_uploads_per_day.json (1.6 MB)
  - format_uploads_per_day_per_object.json (1.5 MB): like format_uploads_per_day, but uploads of the same type are ignored on the same day and object.
  - format_uploads_per_month_per_object.json (12 KB): like format_uploads_per_day_per_object, but ordered so it can be used in the webpage for the graph. Data is grouped by month and type.
  - number_of_files_per_object.json (2.8 KB)
- /scripts
  - analyze_data.py (11 KB)
  - extract_data.py (2.3 KB)
  - get_data.py (2.6 KB)
  - plot_data.py (1.5 KB)

Download the data yourself.
As of June 2021, this will produce roughly 40 GB of JSON data and make about 10 million requests. The script creates one file per available entry, containing the JSON metadata. This means there will be millions of files in a single folder. I did it this way because it was the simplest reasonably fast method that works well with threading. This will obviously be terribly slow on a slow disk. I used an NVMe SSD and had a total execution time of about 12 hours.
If you want to do something less stupid, go ahead and change the script ;) For downloading the data once this was fine.
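To give an idea of the "one file per entry, with threads" approach, here is a minimal sketch. It is not the actual get_data.py: the function `download_range` is made up for this example, and `fetch` is a hypothetical callable that you would have to implement yourself (request the metadata for one ID from the respective API and return the parsed JSON, or `None` if the entry does not exist).

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def download_range(fetch, first_id, last_id, out_dir, workers=16):
    """Fetch metadata for every ID in [first_id, last_id] using threads.

    Each available entry is written to <out_dir>/<id>.json.
    Returns the number of entries that were saved.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    def save(entry_id):
        metadata = fetch(entry_id)  # hypothetical: one API request per ID
        if metadata is None:
            return False  # deleted or unpublished entry
        target = out / f"{entry_id}.json"
        target.write_text(json.dumps(metadata), encoding="utf-8")
        return True

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(save, range(first_id, last_id + 1)))
```

Because every worker writes its own file, the threads never share state and no locking is needed. That simplicity is what you pay for with millions of small files in one folder.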
1. Register for thingiverse.com and myminifactory.com’s APIs.
2. For myminifactory.com, request an access token by opening
   https://auth.myminifactory.com/web/authorize?client_id=XXX&redirect_uri=YYY&response_type=token&state=RANDOM_STRING
   where client_id should be the name of your app and redirect_uri the same redirect URI that was given for the registration. I used ngrok for the callback URI, but I’m not sure you’d actually need that.
3. You are redirected to YYY#access_token=TTT&expires_in=604800&state=RANDOM_STRING&token_type=Bearer, where TTT is the one we need.
4. Run the get_data.py script with these parameters: 1 and the highest value you can find under “newest” on the respective site.

@online{usage-statistics-of-3d-printing-file-formats,
author = {Rossel, Jost},
title = {Usage Statistics of 3D Printing File Formats},
year = 2022,
url = {https://upb-syssec.github.io/blog/2022/3d-printing-file-format-usage/},
urldate = {2024-04-24}
}