But isn't it fair to say that we can (and do) use the scientific method to test individual elements of evidence for some historical event?
Absolutely yes. And such individual tests can be undertaken in questions of historical authenticity. It's important to recognize the difference between these small-scoped individual tests, which often are susceptible to the hypothetico-deductive method, and the broad overall question, which is inductive.
Let's say a boot is found at the site of a purported battlefield. A comparative morphological examination could reveal it to be consistent with boots made for Prussian soldiers in the late 1700s. That is "soft" science, but often helpful and revealing. Additionally, the boot leather could be subjected to radiocarbon dating. The narrow hypothesis here might be, "The boot was made in the late 1700s." The deductive part of the H-D process is to deduce that if the boot was made in that time, the carbon isotope ratios will be consistent with that age. In that way, we propose this method (one of possibly several) as a way to test the hypothesis empirically. Falsification would occur if the ratios were reliably computed for the sample but were not consistent with a late-1700s date. But if the ratio matches to within an acceptable epsilon, our prior deduction now allows us to assert the hypothesis.
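To make the shape of that test concrete, here is a minimal sketch in Python. It is only an illustration of the H-D structure, not a real radiocarbon calibration. The names (DatingResult, evaluate_hypothesis) and all the numbers are hypothetical, and it assumes the lab result has already been reduced to an estimated calendar year plus a tolerance standing in for the "acceptable epsilon."

```python
from dataclasses import dataclass


@dataclass
class DatingResult:
    """Hypothetical summary of a dating measurement (illustrative only)."""
    year_estimate: float  # estimated calendar year the material dates to
    epsilon: float        # acceptable tolerance, in years, around the estimate
    reliable: bool        # whether the measurement was reliably computed


def evaluate_hypothesis(result: DatingResult, start: int, end: int) -> str:
    """Compare a measurement against the date range the hypothesis predicts.

    Returns "inconclusive" if the measurement is unreliable, "falsified" if
    the measured interval misses the predicted range entirely, and
    "consistent" if the two overlap.
    """
    if not result.reliable:
        # An unreliable measurement can neither falsify nor support anything.
        return "inconclusive"
    low = result.year_estimate - result.epsilon
    high = result.year_estimate + result.epsilon
    overlaps = low <= end and high >= start
    return "consistent" if overlaps else "falsified"


# Hypothesis: "The boot was made in the late 1700s" (taken here as 1770-1800).
print(evaluate_hypothesis(DatingResult(1785, 30, True), 1770, 1800))
print(evaluate_hypothesis(DatingResult(1920, 30, True), 1770, 1800))
```

Note that the function only ever answers the narrow question about the boot's age. Nothing in it speaks to the larger historical question.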
This might be part of a larger historical question such as, "Did Prussian soldiers fight on this battlefield in 1780?" The scientific confidence in the age of the boot and the comparative assurance of its style may well convince us that at least a Prussian boot was here. Whether the historical question is confirmed at the larger scale would require other examinations, such as whether the bulk of evidence (not just the boot, but perhaps also uniforms, ammunition, Prussian money, etc.) points inductively to that conclusion. If the evidence is just the boot, then you would entertain other possibilities, such as a Prussian boot having been borrowed or captured. In the end the large question is entirely inductive, even though it has been considered according to several deductive steps.
A similar process is followed for scientific evidence in court, such as crime-scene analysis. If we hypothesize that the accused was at the crime scene, we can deduce that he left behind evidence such as DNA, fingerprints, or footprints. The null hypothesis is that the accused is innocent, which implies the null sub-hypothesis that he was not at the crime scene. The comparison of DNA is a reasonably reliable process, hence we deduce further that if a DNA match occurs, it is because the DNA-bearing material was physically there and not because of a fault in the process. If we find a match, the null sub-hypothesis is falsified -- the accused was evidently at the crime scene. A jury may decide later that this and other evidence is best explained, inductively, by the parsimonious verdict that he is guilty.
That the individual elements are deductive indicates only the form of reasoning we use to structure the experiment. It does not guarantee that the result is deductively strong. The deduction would be something like, "DNA is unique to a person; the sample matches the subject; therefore the sample came from the subject." That doesn't rule out laboratory error or prosecutorial misconduct. But if the results are contested in that way, the defense bears the burden of proof for that point alone. Jury instructions are explicit in cases like that, and all the courtroom mechanisms that vary based on who carries the burden of proof shift to favor the prosecution.
Romulus' unique formulation of methodology seems to center upon the notion that his simple mention of an alternative is an effective defense or rebuttal, without any further substantiation. In his view, the pro-Apollo camp's not having affirmatively and conclusively refuted his speculative alternatives to his satisfaction constitutes an abrogation of the original burden of proof and is therefore somehow unscientific.