Objective: The coronary artery calcium (CAC) score is a strong predictor of future adverse cardiovascular events. Anthropomorphic phantoms are often used in CAC studies on computed tomography (CT) because they allow scanning and reconstruction parameters to be evaluated or varied, within or across scanners, against a reference standard. Such studies often produce a large number of datasets, and manual assessment of these datasets is time-consuming and cumbersome. Therefore, this study aimed to develop and validate a fully automated, open-source quantification method (FQM) for coronary calcium in a standardized phantom. Materials and Methods: A standard, commercially available anthropomorphic thorax phantom was used with an insert containing nine calcifications of different sizes and densities. An extension ring was used to simulate two different patient sizes. Image data were acquired on four state-of-the-art CT systems using routine CAC scoring acquisition protocols. To assess interscan variability, each acquisition was repeated five times with small translations and/or rotations. Reference CAC scores (Agatston, volume, and mass) were calculated using vendor-specific software. Both the international standard CAC quantification methods and the vendor-specific adjustments were implemented in FQM. Reference and FQM scores were compared using Bland-Altman analysis, intraclass correlation coefficients (ICCs), risk reclassifications, and Cohen’s kappa. In addition, the robustness of FQM was assessed using varied acquisition and reconstruction settings and validation on a dynamic phantom. Furthermore, image quality metrics were implemented, including noise power spectrum (NPS), task transfer function (TTF), contrast-to-noise ratio, and signal-to-noise ratio. The image quality results were validated using imQuest software.
Results: Three parameters of the CAC scoring methods varied among the vendor-specific software packages: the Hounsfield unit (HU) threshold, the minimum area required to designate a group of voxels as calcium, and the use of isotropic voxels for the volume score. FQM was in high agreement with the vendor-specific scores, and ICCs (median [95% CI]) were excellent (1.000 [0.999-1.000] to 1.000 [1.000-1.000]). Interplatform reliability was excellent (κ = 0.969 and κ = 0.973). TTF results showed a maximum deviation of 3.8%, and NPS results were comparable to those from imQuest. Conclusions: We developed a fully automated, open-source, robust method to quantify CAC on CT scans of a commercially available phantom. The automated algorithm also includes image quality assessment for fast comparison of different acquisition and reconstruction parameters.
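
To illustrate how two of the vendor-dependent parameters named in the Results (the HU threshold and the minimum lesion area) enter an Agatston-type score, the following is a minimal sketch for a single axial slice. It is not the FQM implementation. The function names, the 4-connected lesion grouping, the default minimum area of 1 mm², and the demo values are assumptions made for illustration; only the 130 HU threshold and the density weights (1 to 4 for peak values of 130-199, 200-299, 300-399, and ≥400 HU) follow the conventional Agatston definition. The sketch uses only the Python standard library so that it runs without dependencies.

```python
from collections import deque


def density_weight(peak_hu):
    """Conventional Agatston density weighting factor from a lesion's peak HU."""
    if peak_hu >= 400:
        return 4
    if peak_hu >= 300:
        return 3
    if peak_hu >= 200:
        return 2
    if peak_hu >= 130:
        return 1
    return 0  # lesions whose peak is below 130 HU contribute nothing


def agatston_slice_score(slice_hu, pixel_area_mm2,
                         threshold_hu=130.0, min_area_mm2=1.0):
    """Agatston-type score of one slice given as a 2D list of HU values.

    threshold_hu and min_area_mm2 are the two tunable parameters; the
    default for min_area_mm2 is an illustrative assumption.
    """
    rows, cols = len(slice_hu), len(slice_hu[0])
    seen = [[False] * cols for _ in range(rows)]
    score = 0.0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or slice_hu[r][c] < threshold_hu:
                continue
            # Flood-fill one 4-connected group of above-threshold pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            n_pixels, peak = 0, slice_hu[r][c]
            while queue:
                y, x = queue.popleft()
                n_pixels += 1
                peak = max(peak, slice_hu[y][x])
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and slice_hu[ny][nx] >= threshold_hu):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            area = n_pixels * pixel_area_mm2
            # Groups smaller than the minimum area are not counted as calcium.
            if area >= min_area_mm2:
                score += area * density_weight(peak)
    return score


# Hypothetical 4 x 5 slice: one 3-pixel lesion and one isolated pixel.
demo = [
    [0,   0,   0, 0,   0],
    [0, 250, 410, 0,   0],
    [0, 220,   0, 0, 150],
    [0,   0,   0, 0,   0],
]

print(agatston_slice_score(demo, 0.25, min_area_mm2=0.5))   # 3.0
print(agatston_slice_score(demo, 0.25, min_area_mm2=0.25))  # 3.25
```

In this toy slice, lowering the minimum area from 0.5 mm² to 0.25 mm² admits the isolated 150 HU pixel and raises the score from 3.0 to 3.25, which shows why differences in this parameter between software packages can change the reported score.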