Validating XML against a DTD
Hello,
In my project I receive XML via an HTTP POST request,
and this XML needs to be validated against a DTD on a remote server.
For example, assume the XML is:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE PUSH SYSTEM "http://whatever:xx/whatever/sms.dtd">
<PUSH ICP="Partenaire" ADM="UtilisateurChezPartenaire" VERSION="1.0">
..........
..........
..........
</PUSH>
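The Java code uses the Xerces parser:
import org.apache.xerces.parsers.DOMParser;

DOMParser parser = new DOMParser();
// Turn on validation; for a DTD, the grammar is located through the document's DOCTYPE.
parser.setFeature("http://xml.org/sax/features/validation", true);
// Note: this Xerces property belongs to XML Schema validation, not to DTD validation.
parser.setProperty(
    "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation",
    "http://whatever:xx/whatever/sms.dtd");
parser.parse("c://localcopy/sms.xml");
This works fine.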
But in some cases I receive an XML file without any <!DOCTYPE> declaration
(it still needs to be validated against the same DTD, as stated in the business rules).
In such a case, how can the XML be validated against the DTD?
Am I expected to add the
<!DOCTYPE PUSH SYSTEM "http://whatever:xx/whatever/sms.dtd">
to every XML via some XSLT script, or is there a direct way of
validating an XML that has no DOCTYPE reference to a DTD?
(The assumption is that the DTD location is known beforehand.)
[1574 byte] By [angeshwara] at [2007-11-26 14:56:42]

If it doesn't have the DOCTYPE declaration, then it's not valid right off the bat. Everything else may be fine, but it's already wrong.
Essentially, you or whoever has published an API that says:
1. You will have a DOCTYPE referencing this particular DTD.
2. You will conform to that DTD in your XML.
Failure on either of these points means the XML is invalid, and in my opinion you reject it.
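For what it's worth, the first rule above could be checked roughly like this in Java. This is only a sketch assuming the JDK's built-in JAXP parser; the class name DoctypeCheck, the method declaresDtd and the example system ID are made up for illustration. The entity resolver hands back an empty DTD so the check itself never fetches anything over the network.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

class DoctypeCheck {
    /**
     * Returns true only if the XML is well-formed and declares the expected
     * DTD as its SYSTEM identifier. Malformed XML is treated as "reject" too.
     */
    static boolean declaresDtd(String xml, String expectedSystemId) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Supply an empty DTD for any external entity; we only inspect the declaration.
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            DocumentType doctype = doc.getDoctype(); // null when there is no DOCTYPE
            return doctype != null && expectedSystemId.equals(doctype.getSystemId());
        } catch (Exception e) {
            return false; // not well-formed, or the parser could not be configured
        }
    }
}
```

A caller would reject the message whenever declaresDtd returns false, and only then go on to the actual validation against the DTD.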
Now, that said, you may not be able to enforce that rule. You SHOULD be able to, but you may not. So you may have to work without the DTD, but I would make it very clear to the powers that be that you're now working around what amounts to laziness on someone else's part, and that it's bad practice to work around other people's screw-ups. Better by far to fix it at the source.
Just my 2 krupplenicks; your mileage may of course vary.
PS.
Hi,
I get your point. Anyway, let me explain a bit more about my project.
I connect to a telecom service provider's server, and the service
provider sends SMS to my server via HTTP POST. I receive them and send
an acknowledgement/error (which, by the way, is again sent as XML over HTTP, and each has its own DTD).
I do the opposite when sending SMS back to the service provider,
i.e. I create an XML based on the DTD, send it via HTTP POST and get
an ack/error.
Coming to your second point,
'You will conform to that dtd in your xml':
that's true, but that doesn't mean I can skip the verification step that checks
whether the incoming XML adheres to the DTD! The DTD was provided by the service provider,
so we can assume every XML I receive from them should adhere to the DTD, but I still have to make sure every XML I receive/send via my server adheres to that DTD.
So this is not someone else's laziness but merely another way to send SMS securely without corrupting the data, so there is nothing much to fix at the source.
Anyway, let me know if you have a solution to the problem.
I haven't found a way to validate an XML that doesn't have a DOCTYPE.
Right now I am manually adding a DOCTYPE reference to the XML via an XSLT, then validating, and then removing the DOCTYPE again via another XSLT.
This doesn't seem to be the right approach, since some forums say that XSLT transformation consumes CPU and memory and should be avoided whenever possible.
Besides, the XML files I receive via HTTP POST are SMS data, i.e. there will be thousands of them, e.g. about 3000 SMS every 12 seconds.
So transforming via XSLT is not a good option. Let me know if there is a simpler method.
Instead of processing these DTD-less XML files via XSLT, why not simply insert the declaration at the start using plain file access? And why do you have to remove it again? At the very least, it shouldn't make the XML invalid.
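A minimal sketch of that idea, working on the message as a string rather than a file. The class and method names are invented for illustration. It assumes the input has no DOCTYPE yet, no byte-order mark or leading whitespace, and that the root element name is known in advance. The DOCTYPE has to go after the XML declaration if there is one, which is the only subtle part.

```java
class DoctypeInserter {
    /**
     * Returns the XML text with a DOCTYPE declaration inserted after the
     * XML declaration (if present), or at the very start otherwise.
     */
    static String withDoctype(String xml, String rootName, String systemId) {
        String doctype = "<!DOCTYPE " + rootName + " SYSTEM \"" + systemId + "\">";
        int declEnd = xml.indexOf("?>");
        if (xml.startsWith("<?xml") && declEnd >= 0) {
            int cut = declEnd + 2; // position just past the closing "?>"
            return xml.substring(0, cut) + "\n" + doctype + xml.substring(cut);
        }
        return doctype + "\n" + xml;
    }
}
```

The result can then be fed to a validating parser as usual. Since nothing is parsed or transformed here, this should be cheap compared with an XSLT pass, though that is worth measuring on real traffic.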
The XML I receive has been constructed by the telecom service provider, and the constructed XML does not have a DOCTYPE.
When I receive it, I have to validate it against the DTD (provided by the service provider). Based on the data in the XML, I then have to send an ack/error or an appropriate message as XML via HTTP POST.
So either way I need to validate an XML that doesn't have a DOCTYPE in the most efficient way possible.
Hello,
Does anyone have a solution for my problem?
I have more than 5 DTDs against which the XML needs to be validated. It doesn't make much sense to manually add the appropriate DOCTYPE to the XML every time I want to check it against a DTD!
There has to be a simpler method that avoids repeated XSLT transformations.
OK, I managed to solve the problem myself; using a Transformer makes the job easier.
Here is the code for anyone who might run into the same problem:
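import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

public void whatEver() {
    try {
        File xmlFile = new File("c://a.xml");
        // Load the whole document into memory first, so the file can be overwritten below.
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document xmlDocument = dBuilder.parse(xmlFile);
        DOMSource source = new DOMSource(xmlDocument);
        // Write the result back to the same file.
        StreamResult result = new StreamResult(xmlFile);
        // A Transformer created without a stylesheet performs an identity copy.
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer transformer = tf.newTransformer();
        // Emit a DOCTYPE with this SYSTEM identifier in the output.
        transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "./transformations/Push_Gateway.dtd");
        transformer.transform(source, result);
    } catch (Exception e) {
        e.printStackTrace();
    }
}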
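The above code reads the a.xml file and adds or removes the DOCTYPE, depending on what is set via transformer.setOutputProperty, and the same XML file is updated. This way the same XML can reference different DTDs and be validated against each of them. Note that this step only rewrites the file; the validation itself still happens when the file is parsed afterwards with validation turned on. This saves adding/removing the DOCTYPE via XSLT.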
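For anyone who wants to avoid touching the file at all, the same trick can be sketched entirely in memory, with the validation step included. This is not the exact code I run, just an illustration: the class name, the method name and the system ID are made up, and it assumes the incoming XML has no DOCTYPE and that the DTD text is already available locally. The entity resolver feeds the DTD text to the parser, so nothing is fetched over HTTP.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class InMemoryDtdValidator {
    // Hypothetical system ID; it is never fetched because the resolver below intercepts it.
    private static final String SYSTEM_ID = "http://example.invalid/known.dtd";

    /** Returns the validation errors; an empty list means the XML conforms to the DTD. */
    static List<String> validate(String xml, String dtdText) {
        List<String> errors = new ArrayList<>();
        try {
            // 1. Parse the DOCTYPE-less input without validation.
            DocumentBuilder plain = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = plain.parse(new InputSource(new StringReader(xml)));

            // 2. Serialize to a string; the identity transformer adds the DOCTYPE.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, SYSTEM_ID);
            StringWriter withDoctype = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(withDoctype));

            // 3. Re-parse with DTD validation switched on.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder validating = factory.newDocumentBuilder();
            // Answer every external entity request with the known DTD text.
            validating.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader(dtdText)));
            validating.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            validating.parse(new InputSource(new StringReader(withDoctype.toString())));
        } catch (Exception e) {
            throw new IllegalArgumentException("XML could not be parsed", e);
        }
        return errors;
    }
}
```

Each message is parsed twice here, so for a high message rate it would be worth measuring this against simply inserting the DOCTYPE into the text before a single validating parse.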
Hi,
I am trying to use the code you posted, but somehow it's not validating properly. Maybe that's because I am using the wrong jar file (xml_apis_2_7_1.jar), which is an internal jar file from our project.
Can you specify which jar file you used for the Transformer
and for the other classes (factories, etc.) in your sample?
Thanks,
Nag