This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.
Summary: | Opening file w/ large DTD and lots of other files linked using entities takes 25 minutes! | ||
---|---|---|---|
Product: | xml | Reporter: | _ tboudreau <tboudreau> |
Component: | TAX/Lib | Assignee: | issues@xml <issues> |
Status: | VERIFIED FIXED | ||
Severity: | blocker | CC: | eadams, ffjspb, genero, jchalupa, jglick, jkovar, pkeegan |
Priority: | P2 | Keywords: | PERFORMANCE |
Version: | -S1S- | ||
Hardware: | PC | ||
OS: | Windows XP | ||
Issue Type: | DEFECT | Exception Reporter: | |
Bug Depends on: | 22789 | ||
Bug Blocks: | 27998 |
Description
_ tboudreau
2002-04-05 23:07:23 UTC
Tim please turn off tree editor module to improve performance. Tree editor module is a known performance issue mentioned in FAQs.
Performance is terrible - P2.
Any progress on this issue?
Set target milestone to TBD
Users ought to be warned about this. No one would want to expand a big XML file. It is also noted in module help.
This bug is reported in version <= 3.4dev and still not fixed. Due to that it forbids the release candidate for 3.4 to be promoted. Are you aware of that and are you intensively working on the fix? If not, you should consider some corrective action.
XML model (TAX) is not designed for large documents -- then I could say it is as designed behaviour, however I do not agree with it. Other work on TAX was postponed for 3.4, large files tree editing is not our strategic feature. One important thing was fixed: referenced TAX instances are correctly weak - unused references can be released by GC. [Reverting priority back to original P3.] Possible solution: external entities should be parsed (and built TAX) just on demand.
We estimated that proper fix implementation requires 3 developer/weeks (rewrite the parser and introduce positions). Based on this estimate the strategy committee marked XML performance as non strategic issue for Sun's contribution to NetBeans. From Sun's point of view it was classified as nice to have enhancement. From user's point of view it is a bug. I do not know how to express this state using IZ.
I still consider this a P2 DEFECT. The fact that no Sun employee plans to work on it for 3.4 is irrelevant; it is a serious performance bug.
+1 If we say we have *XML editing* capability, this sets expectations that you will be able to edit *any* XML file. If we want to call it "Small XML file editing" that's another story.
It is better that this be a known bug slated to be fixed than to get complaints about what happens when users open XML files and NetBeans hangs for half an hour and have no decent answer for why this is not a priority (I managed to take a shower and cook and eat breakfast in the time it took NetBeans to parse the O'Reilly book XML). I'm taking the initiative (may the gods curse me) to raise the priority to P2. The proper process for this bug is to get a waiver for it for 3.4.
> Possible solution: external entities should be parsed (and
> built TAX) just on demand.
Question 1: Was this the solution you evaluated when coming up with
the 3-developer-weeks estimate?
Question 2: How will you determine if an external entity contains
valid XML if you don't parse it? Hypothetical ugly docbook example:
XML File x:
<chapter>
<section>
& Entity 1
</chapter>
Entity y:
Some random text....
</section>
In this case, x is malformed XML unless you parse the document, and
y is malformed XML unless parsed in the context of the document.
I could be wrong, but I don't think this sort of ugliness is
illegal in XML. Also consider the case where x contains the
closing section tags, but y is malformed: as far as I can tell, the
"on demand" has to be in the initial parse of the document, or
you end up with tons of situations where you have false positives
and negatives.
You might be able to go halfway and not parse Y if X is well-formed
and simply assume that Y is well-formed as well. However, when
someone initiates a transform (such as converting DocBook to HTML
or PDF), suddenly any hidden errors in Y are found.
You may want to look at the work William Will of XEMO is doing with
the MDR - he is using it precisely to manage large XML files (a
symphony in XML may be 200Mb) and index and traverse the tree in
ways that XML parsing is not efficient for. That seems like what
could help here, though it probably also requires wholesale
rearchitecting of TAX.
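The "go halfway" idea above (check that X is well-formed without ever reading Y) could be sketched roughly as follows. This is only an illustration using the standard JAXP SAX API, not TAX or NetBeans code; the class name `MasterOnlyCheck` and the method `check` are made up for the example, and it assumes that replacing every external entity with empty content is an acceptable stand-in for "not parsing Y".

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of TAX or NetBeans.
class MasterOnlyCheck {

    /**
     * Checks only the master document X. Every external entity (and any
     * external DTD subset) is resolved to empty content, so Y is never read.
     * Returns null if X is well-formed, otherwise the parser's error message.
     */
    static String check(String masterXml)
            throws IOException, ParserConfigurationException, SAXException {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Assumption: skipping Y means substituting an empty entity.
                return new InputSource(new StringReader(""));
            }
        };
        try {
            parser.parse(new InputSource(new StringReader(masterXml)), handler);
            return null;
        } catch (SAXException e) {
            // DefaultHandler rethrows fatal errors, so malformed input lands here.
            return e.getMessage();
        }
    }
}
```

A check like this says nothing about Y itself, which is exactly the limitation described above: errors hidden in Y would only surface once something (a transform, for instance) actually expands the entity.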
Clarification: Example should be:
XML File x:
<chapter>
<section>
& Entity y
</chapter>
Entity y:
Some random text....
</section>
Tim, even though your example is not correct ("the logical and physical structures in an XML document are properly nested" - http://www.w3.org/TR/REC-xml#wf-entities) I see a problem if the x document is properly nested and y is not. You are right, we should think about it. The question is whether it is necessary to load all entity references into memory. There could be a simple background task which just checks if the document is well-formed and adds an error badge on the XML node icon. This could be independent of the XML Tree Editor, so if you disable the XML Tree Editor module you can still see an error icon on wrong XML documents.
IMHO the tree editor should only check that the X-document (master) is well-formed. The ref to the Y-document (child) should not be parsed by default, unless the user expands it - in which case it would be like expanding that actual Y.ent file in the Explorer, i.e. changes are applied to Y.ent and may later be saved (if there is a Y.ent in mounted filesystems, else consider it read-only).
3.4_WAIVER: architecture changes are necessary (no simple fix is possible).
suggested readme entry: Large and complex XML files open very slowly in the tree editor. Workaround: Open such files with the Edit command to edit them in the text editor instead of the tree editor.
I agree with waiver for 3.4
I agree with the waiver. Leaving my name on the CC for relnote purposes.
Version was changed on 'S1S 4.2'.
In order to fix this bug, I plan to rewrite TreeDocumentType a little bit. Proposed implementation should share DTD model among all document models referencing it by the same ID. It means that I'll remove parent-child pairing as there is not a unique one. It works well for read-only (no firing) subtrees. DTD entity model subtree is read-only even in current implementation.
Old model: TreeDocumentType -> DTD children list
New model: TreeDocumentType -> DTD ID
a global soft map<DTD_ID, DTD children list>
I fixed the most common case when DTD size affects the actual XML document using it. It shares DTD subtrees so it's not necessary to reparse it for subsequent usage. It's not exactly your case, it's just a part of it. Remaining part should be addressed by issue #22789 [LATER].
removing RELNOTE keyword as this does not seem to apply to 3.5.
Doesn't it apply to 3.5? What if you open a file with many included entities and then try to use XML code completion - does it try to load them all?
Code completion uses background thread. It must read all parameter entities. AFAIK RELNOTE keyword was targeted to the Visual XML editor module that exposed TAX deficiency in GUI thread.
Yes, the RELNOTE was about performance in the visual editor. If there are any other visible performance issues please let me know and put keyword back in.
and, for whatever it's worth, the removal of the visual XML editor (and the reason and the fact that it's available in the update center) will be part of the release notes
Nevada relnote note: XML visual editor publishing at AU center is not driven by Sun.
Fixed as in it's really fast, or fixed as in the XML editor is no longer in the production build? If I open the O'Reilly book now in it, will I have a nice experience?
For Visual XML editor see issue #31656.
VERIFIED
Removing #34223 blocker. It's already in prj40_prototype codebase.
*** Issue 31656 has been marked as a duplicate of this issue. ***
what kind of forum is this? not giving any kind of solution for our problem???? we are working on the same problem since a week and still not getting the problem.. we get the error of "outofmemory"... the first worst thing about netbeans is that it takes eons to startup its modules and then works too slow giving errors of out of memory... we have a 2GB RAM with 360GB hard disk and still the same problem...
do we have to buy a super computer and make netbeans work on it?? we are planning to change the IDE if the same problem persists... netbeans is really the worst IDE... sorry, but before changing the IDE would like to have a solution to memory problem if you can suggest
I filed this bug in 2002 about visual XML tree editor that was dropped from NetBeans five years ago. Why is a bug in a component that has not been part of NetBeans since 2003 being reopened?
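For readers following the "global soft map<DTD_ID, DTD children list>" model proposed earlier in this thread, a minimal sketch of that kind of cache might look like the code below. It is not the actual TAX implementation: `SharedDtdCache`, `childrenFor` and `parseCount` are hypothetical names, a `List<String>` stands in for the DTD children list, and the parser is passed in as a plain function.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of a soft map keyed by DTD ID; not TAX code.
class SharedDtdCache {
    // Soft references let the GC drop cached DTD subtrees under memory pressure.
    private final Map<String, SoftReference<List<String>>> cache = new HashMap<>();
    private int parseCount = 0;

    /**
     * Returns the shared, read-only children list for the given DTD ID,
     * invoking the supplied parser only if no live cached copy exists.
     */
    synchronized List<String> childrenFor(String dtdId,
                                          Function<String, List<String>> parser) {
        SoftReference<List<String>> ref = cache.get(dtdId);
        List<String> children = (ref == null) ? null : ref.get();
        if (children == null) {
            // Copy and wrap so every document sees the same immutable subtree.
            children = Collections.unmodifiableList(
                    new ArrayList<>(parser.apply(dtdId)));
            cache.put(dtdId, new SoftReference<>(children));
            parseCount++;
        }
        return children;
    }

    /** Number of times the parser was actually run. */
    synchronized int parseCount() {
        return parseCount;
    }
}
```

Two documents that reference the same DTD ID would then receive the same list instance, and the DTD would be parsed once for as long as the cached copy stays reachable. Keeping the shared list unmodifiable mirrors the point made in the thread that sharing works for read-only subtrees.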