This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.

Bug 22152 - Opening file w/ large DTD and lots of other files linked using entities takes 25 minutes!
Status: VERIFIED FIXED
Alias: None
Product: xml
Classification: Unclassified
Component: TAX/Lib
Version: -S1S-
Hardware: PC Windows XP
Importance: P2 blocker with 1 vote
Assignee: issues@xml
URL:
Keywords: PERFORMANCE
Duplicates: 31656
Depends on: 22789
Blocks: 27998
Reported: 2002-04-05 23:07 UTC by _ tboudreau
Modified: 2008-02-17 23:28 UTC (History)
CC List: 7 users

See Also:
Issue Type: DEFECT
Exception Reporter:


Description _ tboudreau 2002-04-05 23:07:23 UTC
I'm using the XML editor to edit some DocBook sources - 
the O'Reilly book on NetBeans.  The entire source base is 
1.1 Mb in size; the DocBook entity catalog is about 60K.

Click the master document to open it.  Nothing appears to 
be happening.  During this time, attempting to open the 
document tree results in a Please Wait icon.  25 minutes 
(!!!) later, the structural editor mysteriously appears.  

2 suggestions:
1.  Make sure the user has some indication of what's going 
on.

2.  Do anything possible to improve the performance.
Comment 1 _ pkuzel 2002-04-08 09:45:01 UTC
Tim, please turn off the tree editor module to improve performance.

The tree editor module is a known performance issue mentioned in the FAQs.
Comment 2 _ lkramolis 2002-04-25 18:38:44 UTC
Performance is terrible - P2.
Comment 3 _ tboudreau 2002-07-19 19:22:36 UTC
Any progress on this issue?
Comment 4 Marek Grummich 2002-07-22 12:04:37 UTC
Set target milestone to TBD
Comment 5 Jesse Glick 2002-07-25 17:35:13 UTC
Users ought to be warned about this. No one would want to expand a big
XML file.
Comment 6 _ pkuzel 2002-07-25 19:48:38 UTC
It is also noted in module help.

Comment 7 Jaroslav Tulach 2002-07-26 07:58:32 UTC
This bug is reported in version <= 3.4dev and still not fixed. Due to that it
forbids the release candidate for 3.4 to be promoted. Are you aware of that and
are you intensively working on the fix? If not, you should consider some
corrective action.
Comment 8 _ lkramolis 2002-07-26 09:00:22 UTC
The XML model (TAX) is not designed for large documents, so I could say
this is as-designed behaviour; however, I do not agree with that.

Other work on TAX was postponed for 3.4; tree editing of large files is
not a strategic feature for us. One important thing was fixed: referenced
TAX instances are now correctly weak, so unused references can be released
by the GC.

[Reverting priority back to original P3.]

Possible solution: external entities should be parsed (and built TAX)
just on demand.
Comment 9 _ pkuzel 2002-07-26 10:36:54 UTC
We estimated that a proper fix requires
3 developer-weeks (rewrite the parser and introduce positions).

Based on this estimate, the strategy committee marked XML performance
as a non-strategic issue for Sun's contribution to NetBeans.

From Sun's point of view it was classified as a nice-to-have enhancement.
From the user's point of view it is a bug.
I do not know how to express this state using IZ.
Comment 10 Jesse Glick 2002-07-29 02:18:31 UTC
I still consider this a P2 DEFECT. The fact that no Sun employee plans
to work on it for 3.4 is irrelevant; it is a serious performance bug.
Comment 11 _ tboudreau 2002-07-29 05:56:41 UTC
+1

If we say we have *XML editing* capability, this sets expectations
that you will be able to edit *any* XML file.  If we want to call it
"Small XML file editing" that's another story.  It is better that 
this be a known bug slated to be fixed than to get complaints about
what happens when users open XML files and NetBeans hangs for half 
an hour and have no decent answer for why this is not a priority (I 
managed to take a shower and cook and eat breakfast in 
the time it took NetBeans to parse the O'Reilly book XML).


I'm taking the initiative (may the gods curse me) to raise the
priority to P2.  The proper process for this bug is to get a waiver
for it for 3.4.
Comment 12 _ tboudreau 2002-07-29 06:12:47 UTC
> Possible solution: external entities should be parsed (and 
> built TAX) just on demand.

Question 1:  Was this the solution you evaluated when coming up with 
the 3-developer-weeks estimate?

Question 2:  How will you determine if an external entity contains
valid XML if you don't parse it?  Hypothetical ugly docbook example:

XML File x:
<chapter>
  <section>
   & Entity 1
</chapter>

Entity y:
Some random text....
</section>

In this case, x is malformed XML unless you parse the document, and
y is malformed XML unless parsed in the context of the document.
I could be wrong, but I don't think this sort of ugliness is
illegal in XML.  Also consider the case where x contains the 
closing section tags, but y is malformed: as far as I can tell, the
"on demand" has to be in the initial parse of the document, or 
you end up with tons of situations where you have false positives
and negatives.

You might be able to go halfway and not parse Y if X is well-formed
and simply assume that Y is well-formed as well.  However, when 
someone initiates a transform (such as converting DocBook to HTML
or PDF), suddenly any hidden errors in Y are found.

You may want to look at the work William Will of XEMO is doing with
the MDR - he is using it precisely to manage large XML files (a 
symphony in XML may be 200Mb) and index and traverse the tree in 
ways that XML parsing is not efficient for.  That seems like what
could help here, though it probably also requires wholesale 
rearchitecting of TAX.
Comment 13 _ tboudreau 2002-07-29 06:20:37 UTC
Clarification:  Example should be:
XML File x:
<chapter>
  <section>
   & Entity y
</chapter>

Entity y:
Some random text....
</section>
Comment 14 _ lkramolis 2002-07-29 09:25:41 UTC
Tim, even though your example is not correct ("the logical and physical
structures in an XML document are properly nested" -
http://www.w3.org/TR/REC-xml#wf-entities), I see a problem if the x document
is properly nested and y is not. You are right, we should think about it.

The question is whether it is necessary to load all entity references into
memory. There could be a simple background task that just checks whether
the document is well-formed and adds an error badge to the XML node icon. This
could be independent of the XML Tree Editor, so if you disable the XML Tree
Editor module you can still see the error icon on malformed XML documents.
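
To illustrate the background check suggested above, here is a minimal Java
sketch. It is only an assumption of how such a task could look, not code from
the XML module: WellFormednessChecker and its methods are hypothetical names.
It streams the text through the standard JAXP SAX parser without building any
tree and reports only whether parsing succeeded.

```java
import java.io.StringReader;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of TAX or the XML module.
final class WellFormednessChecker {
    // Single daemon thread so the check never runs in the GUI thread.
    private final ExecutorService background = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "xml-wf-check");
        t.setDaemon(true);
        return t;
    });

    // Returns true if the text parses without a fatal error; no model is built.
    static boolean isWellFormed(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // DefaultHandler ignores all content and rethrows fatal errors.
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (Exception e) {
            // SAXException (not well-formed), IOException, or parser setup failure.
            return false;
        }
    }

    // Schedules the check; the caller could badge the node icon when it completes.
    Future<Boolean> checkInBackground(String xml) {
        return background.submit(() -> isWellFormed(xml));
    }

    void shutdown() {
        background.shutdown();
    }
}

final class WellFormednessDemo {
    public static void main(String[] args) throws Exception {
        WellFormednessChecker checker = new WellFormednessChecker();
        Future<Boolean> result =
                checker.checkInBackground("<chapter><section>text</section></chapter>");
        System.out.println("well-formed: " + result.get());
        checker.shutdown();
    }
}
```

A real implementation would read from the file rather than a string, and the
parser would still have to read external entities to check them; the sketch
only shows that the check does not need the tree model.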
Comment 15 Jesse Glick 2002-07-29 15:20:22 UTC
IMHO the tree editor should only check that the X-document (master) is
well-formed. The ref to the Y-document (child) should not be parsed by
default, unless the user expands it - in which case it would be like
expanding that actual Y.ent file in the Explorer, i.e. changes are
applied to Y.ent and may later be saved (if there is a Y.ent in
mounted filesystems, else consider it read-only).
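
A rough Java sketch of this "parse only when expanded" idea follows. It is an
illustration under assumptions, not the TAX API: LazyEntityNode and the loader
callback are hypothetical names, and the "parsed content" is reduced to a
string to keep the example short.

```java
import java.util.function.Supplier;

// Hypothetical sketch: an entity reference whose content is parsed only on expand.
final class LazyEntityNode {
    private final String entityName;
    private final Supplier<String> loader; // assumed callback that reads and parses e.g. Y.ent
    private String content;                // stays null until the node is expanded
    private int loadCount;

    LazyEntityNode(String entityName, Supplier<String> loader) {
        this.entityName = entityName;
        this.loader = loader;
    }

    String name() {
        return entityName;
    }

    synchronized boolean isLoaded() {
        return content != null;
    }

    // Called when the user expands the node in the tree; loads at most once.
    synchronized String expand() {
        if (content == null) {
            content = loader.get();
            loadCount++;
        }
        return content;
    }

    synchronized int loadCount() {
        return loadCount;
    }
}

final class LazyEntityDemo {
    public static void main(String[] args) {
        LazyEntityNode y = new LazyEntityNode("y", () -> "Some random text....");
        System.out.println(y.name() + " loaded before expand: " + y.isLoaded());
        y.expand();
        System.out.println(y.name() + " loaded after expand: " + y.isLoaded());
    }
}
```

Opening the master document would then cost only the parse of X itself, and
each child entity is paid for when (and if) the user expands it.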
Comment 16 _ lkramolis 2002-07-29 17:27:34 UTC
3.4_WAIVER: architecture changes are necessary (no simple fix is possible).
Comment 17 Patrick Keegan 2002-07-29 22:08:43 UTC
suggested readme entry: Large and complex XML files open very slowly 
in the tree editor. Workaround: Open such files with the Edit command 
to edit them in the text editor instead of the tree editor. 
Comment 18 iformanek 2002-07-30 16:49:11 UTC
I agree with waiver for 3.4
Comment 19 Patrick Keegan 2002-07-30 17:36:32 UTC
I agree with the waiver. Leaving my name on the CC for relnote 
purposes.
Comment 20 Martin Schovanek 2003-01-06 13:03:04 UTC
Version was changed to 'S1S 4.2'.
Comment 21 _ pkuzel 2003-02-13 15:38:55 UTC
In order to fix this bug, I plan to rewrite TreeDocumentType a little
bit. The proposed implementation should share the DTD model among all document
models referencing it by the same ID. This means that I'll remove the
parent-child pairing, as there is no unique one. It works well for
read-only (no firing) subtrees. The DTD entity model subtree is read-only
even in the current implementation.

Old model:
TreeDocumentType -> DTD children list

New model:
TreeDocumentType -> DTD ID
a global soft map<DTD_ID, DTD children list>
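
The new model could be sketched in Java roughly as below. This is only an
illustration of the idea, not the actual TAX change: SharedDtdCache and
DocumentTypeNode are hypothetical names, DTD children are reduced to strings,
and the parser is passed in as an assumed callback.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of "a global soft map<DTD_ID, DTD children list>".
final class SharedDtdCache {
    // Soft references let the GC drop a DTD subtree no document is using.
    private final Map<String, SoftReference<List<String>>> cache = new HashMap<>();
    private int parseCount;

    // Returns the shared read-only children list, parsing only on a cache miss.
    synchronized List<String> childrenFor(String dtdId, Function<String, List<String>> parser) {
        SoftReference<List<String>> ref = cache.get(dtdId);
        List<String> children = (ref == null) ? null : ref.get();
        if (children == null) {
            children = Collections.unmodifiableList(new ArrayList<>(parser.apply(dtdId)));
            parseCount++;
            cache.put(dtdId, new SoftReference<>(children));
        }
        return children;
    }

    synchronized int parseCount() {
        return parseCount;
    }
}

// Stand-in for TreeDocumentType: it keeps only the DTD ID, not the children,
// so the shared subtree has no single parent.
final class DocumentTypeNode {
    private final String dtdId;
    private final SharedDtdCache cache;

    DocumentTypeNode(String dtdId, SharedDtdCache cache) {
        this.dtdId = dtdId;
        this.cache = cache;
    }

    List<String> dtdChildren(Function<String, List<String>> parser) {
        return cache.childrenFor(dtdId, parser);
    }
}
```

Two documents that reference the same DTD ID then get the same list instance,
and the list is unmodifiable because a shared subtree cannot fire change
events to one particular parent.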
Comment 22 _ pkuzel 2003-02-14 13:41:30 UTC
I fixed the most common case, where DTD size affects the actual XML document
using it. It shares DTD subtrees, so it's not necessary to reparse them
for subsequent usage.

It's not exactly your case, it's just a part of it. The remaining part
should be addressed by issue #22789 [LATER].
Comment 23 Patrick Keegan 2003-03-03 23:02:16 UTC
removing RELNOTE keyword as this does not seem to apply to 3.5.
Comment 24 Jesse Glick 2003-03-04 04:11:58 UTC
Doesn't it apply to 3.5? What if you open a file with many included
entities and then try to use XML code completion - does it try to load
them all?
Comment 25 _ pkuzel 2003-03-04 08:10:20 UTC
Code completion uses a background thread. It must read all parameter
entities. AFAIK the RELNOTE keyword was targeted at the Visual XML editor
module, which exposed the TAX deficiency in the GUI thread.
Comment 26 John Jullion-ceccarelli 2003-03-04 09:59:52 UTC
Yes, the RELNOTE was about performance in the visual editor. If there
are any other visible performance issues please let me know and put
keyword back in.
Comment 27 Patrick Keegan 2003-03-04 10:13:19 UTC
and, for whatever it's worth, the removal of the visual XML editor (and the reason and the fact that 
it's available in the update center) will be part of the release notes
Comment 28 _ pkuzel 2003-03-04 13:11:25 UTC
Nevada relnote note: XML visual editor publishing at AU center is not
driven by Sun.
Comment 29 _ tboudreau 2003-03-04 18:00:19 UTC
Fixed as in it's really fast, or fixed as in the XML editor is
no longer in the production build?  If I open the O'Reilly book
now in it, will I have a nice experience?

Comment 30 _ pkuzel 2003-03-04 18:15:13 UTC
For Visual XML editor see issue #31656.
Comment 31 Martin Schovanek 2003-03-13 13:51:09 UTC
VERIFIED
Comment 32 _ pkuzel 2003-06-16 13:12:53 UTC
Removing #34223 blocker. It's already in prj40_prototype codebase.
Comment 33 Mikhail Matveev 2008-02-15 16:45:00 UTC
*** Issue 31656 has been marked as a duplicate of this issue. ***
Comment 34 mprpatel 2008-02-17 11:15:38 UTC
what kind of forum is this? not giving any kind of solution for our problem???? we are working on the same problem 
since a week and still not getting the problem.. we get the error of "outofmemory"... the first worst thing about 
netbeans is that it takes eons to startup its modules and then works too slow giving errors of out of memory... we 
have a 2GB RAM with 360GB hard disk and still the same problem... do we have to buy a super computer and make netbeans 
work on it?? we are planning to change the IDE if the same problem persists... netbeans is really the worst IDE... 
sorry, but before changing the IDE would like to have a solution to memory problem if you can suggest
Comment 35 _ tboudreau 2008-02-17 23:28:55 UTC
I filed this bug in 2002 about visual XML tree editor that was dropped from NetBeans five years ago.  Why is a bug in a component that has not been part of 
NetBeans since 2003 being reopened?