This Bugzilla instance is a read-only archive of historic NetBeans bug reports. To report a bug in NetBeans please follow the project's instructions for reporting issues.
Summary: | Analyzing project doesn't scale for large projects | ||
---|---|---|---|
Product: | cnd | Reporter: | rmartins <rmartins> |
Component: | Code Model | Assignee: | Alexander Simon <alexvsimon> |
Status: | RESOLVED FIXED | ||
Severity: | blocker | Keywords: | PERFORMANCE |
Priority: | P2 | ||
Version: | 6.x | ||
Hardware: | All | ||
OS: | Linux | ||
Issue Type: | DEFECT | Exception Reporter: | |
Attachments: |
Thread dump while openning project
Thread dump while analyzing project
Thread dump while parsing project |
Description
rmartins
2009-03-03 20:12:42 UTC
Could you provide us with a thread dump taken at parsing time? Please check your JVM parameters. CND has a property that overrides the default number of parser threads:
-J-Dcnd.modelimpl.parser.threads=4
By default it equals the number of cores. Could you specify this parameter, try NB again, and take a thread dump?

Created attachment 77694 [details]
Thread dump while openning project

Created attachment 77695 [details]
Thread dump while analyzing project

Created attachment 77696 [details]
Thread dump while parsing project
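
For readers unfamiliar with the flag: the `-J` prefix tells the NetBeans launcher to pass the rest of the option to the JVM, so the setting arrives as the system property `cnd.modelimpl.parser.threads`. The following is only a minimal sketch of how such a property could be resolved with a fallback to the core count, as the comment above describes. The class `ParserThreads` and the method `resolve` are hypothetical names; the thread does not show how CND actually reads the value.

```java
// Hypothetical sketch, not CND code: resolve a parser-thread count from a
// system property, falling back to the number of cores.
final class ParserThreads {
    // Property name as quoted in the thread.
    static final String PROPERTY = "cnd.modelimpl.parser.threads";

    /** Returns the configured thread count, or the core count if unset or invalid. */
    static int resolve() {
        int cores = Runtime.getRuntime().availableProcessors();
        // Integer.getInteger returns null when the property is missing or not a number.
        Integer configured = Integer.getInteger(PROPERTY);
        return (configured != null && configured > 0) ? configured : cores;
    }

    public static void main(String[] args) {
        System.out.println("parser threads: " + resolve());
    }
}
```

Running this with `-Dcnd.modelimpl.parser.threads=4` on the `java` command line exercises the override path; without the flag it reports the core count.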

I used the -J-Dcnd.modelimpl.parser.threads=4 param. It starts to parse but fails, outputting that code completion isn't available.

One of the problems was fixed:
- IDE analyzes project in several threads (if computer has several cores).
The problem with memory consumption on project items was partly fixed:
- removed duplicated items
- removed redundant fields
But memory is still too large (about 100Mb on ACE).
P2 because ACE cannot be opened in NB IDE.

Hi Alex,
I have been able to open the separate ACE framework project (and parse) with about 2Gb~2.5Gb of memory (-J-Xmx2048m) (I left it overnight to index...). With ACE_TAO, I was able to index with 3.5Gb (also left it overnight). Did you commit any of your enhancements to the golden repo? I still only have 1 core being utilized...

Hi,
The fixes are available in today's official nightly build. Memory consumption should be reduced, and the analyzing project job now uses all available CPUs.
Btw, you can always access the last successful C++ build in Build Artifacts at http://bertram.netbeans.org/hudson/job/cnd-main/lastSuccessfulBuild/
Thanks, Vladimir.
P.S. Have you already signed up for the NetCAT 6.7 program? :-) http://wiki.netbeans.org/NetCAT67Participants

Hi,
I will join NetCAT 6.7. I hope I can contribute to your great effort. Keep up the good work!
P.S.: thanks for the info about the C++ Build Artifacts

The current nightly build should have the following performance:
- project contains 14315 source and header files
- initial parse consumes about 90 minutes of CPU time (physical time is about 30 minutes)
- parsing in 1700Mb xms memory
Computer:
- memory 4G
- CPU speed 3000
- number of cores 4
It is better but not enough ;-)
ACE is a challenge for us.
>I still only have 1 core being utilized...
- The IDE uses several threads during analysis and parsing.
- The IDE reads the project (reopen) in a single thread.

Ok, I will have to wait for a good build on Build Artifacts or wait for the nightly build ;) The project shows clear and steady progress; for me that is the most important thing. ;) As soon as I manage to test it, I will post about my experience.
Thanks, Rolando

Hi,
I am testing NetBeans-dev-cnd-main-47-on-090319-full and detected a strange behavior. Still using ACE_TAO, I noticed that after a while the parsing stops using the 4 cores and starts using only one. How far parsing gets before this happens is directly linked to the amount of memory given to the VM. With 2Gb, the 4 cores kept working until 20% of the project was parsed; when I switched to 1.5Gb, the same behavior appeared when parsing reached about 10%. Hope it helps.
Rolando

Thanks for reporting the problem. This is a "post parsing project task". It was parallelized in change set: http://hg.netbeans.org/cnd-main?cmd=changeset;node=cd40f67846f4

Congrats Alex, it's way better ;) Using 2Gb on an Intel q9450@3.2Ghz:
Now I can parse the ACE framework without any issue, 2-3 minutes wall time (and it uses the 4 cores). http://download.dre.vanderbilt.edu/previous_versions/ACE-5.6.8.tar.bz2
With the ACE & TAO project I get an error about completion failed. http://download.dre.vanderbilt.edu/previous_versions/ACE+TAO-5.6.8.tar.bz2

Hi Alex,
is the opening of a project parallelized? Compared to the other tasks it seems slow... What does the "open project" task comprise? Does it scan and check all files of a project?
Thanks, Rolando

>What does the "open project" task comprise?
- convert paths stored in "configuration.xml" into project items
It includes the following time-consuming operations:
- convert path to canonical file
- detect file MIME type
- create file object by file (class FileObject)
- create data object by file object (class DataObject)
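
To give a feel for why these per-item steps add up, here is a minimal sketch of the first two operations using only JDK calls. It is an illustration under assumptions: `ProjectItemCost`, `Item`, and `resolve` are hypothetical names, and the real IDE goes on to create NetBeans `FileObject` and `DataObject` instances, which are not modeled here.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical sketch, not NetBeans code: the JDK-level equivalent of
// "convert path to canonical file" and "detect file MIME type" for one item.
final class ProjectItemCost {
    /** Holds the result of both steps for a single stored path. */
    static final class Item {
        final File canonical;
        final String mimeType; // may be null when the platform cannot tell

        Item(File canonical, String mimeType) {
            this.canonical = canonical;
            this.mimeType = mimeType;
        }
    }

    static Item resolve(String storedPath) throws IOException {
        // Canonicalization may consult the file system (symlinks, "..", case).
        File canonical = new File(storedPath).getCanonicalFile();
        // Type detection is platform dependent and may return null.
        String mime = Files.probeContentType(canonical.toPath());
        return new Item(canonical, mime);
    }
}
```

Each call touches the file system at least once, so repeating it for every project item, including files the code model never needs, makes the cost proportional to the total file count.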
Project ACE+TAO contains a lot of unnecessary files.
For example:
- number of c/c++/header files: ~14K
- number of other (*.sln, *.bmak, *.vcproj, ...) files: ~44K
So the problem is in the 44K other files.

Hi Alex,
can you do it in parallel? Is there any way to have an ignore-extensions list (*.sln, *.bmak, *.vcproj, ...) and simply bypass them (perhaps you have already done this... ;) )
Rolando

>Is there any way to have an ignore-extensions list (*.sln, *.bmak, *.vcproj, ...)
Yes: Tools->Options->Miscellaneous->Files. Modify "Ignore Files pattern". By default CND adds the following ignore pattern: ".*\\.(o|lo|la|Po|Plo)$". Adding other extensions is the user's responsibility.
>can you do it in parallel?
I can, but it is error prone and I prefer not to do it.

Alex,
thanks for the info. I am going to add the extensions to the ignore list.
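
As an illustration of what extending the pattern could look like, the sketch below compares the default pattern with a version that also covers the extensions named in this thread. It assumes the quoted default is written as a Java string literal (so the regex itself is `.*\.(o|lo|la|Po|Plo)$`) and that the IDE applies the pattern to a file name with a `find`-style match; both are assumptions. `IgnorePatternDemo` and `ignored` are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: checking file names against an ignore pattern.
final class IgnorePatternDemo {
    // Default pattern as quoted in the thread.
    static final Pattern DEFAULT = Pattern.compile(".*\\.(o|lo|la|Po|Plo)$");
    // Assumed extension of the default with the file types named in the thread.
    static final Pattern EXTENDED =
            Pattern.compile(".*\\.(o|lo|la|Po|Plo|sln|bmak|vcproj)$");

    /** Returns true when the file name matches the given ignore pattern. */
    static boolean ignored(Pattern pattern, String fileName) {
        return pattern.matcher(fileName).find();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"main.o", "ACE.sln", "ace.vcproj", "Assert.h"}) {
            System.out.println(name
                    + " default=" + ignored(DEFAULT, name)
                    + " extended=" + ignored(EXTENDED, name));
        }
    }
}
```

Trying a pattern this way before pasting it into the options dialog is a cheap check that headers such as Assert.h are not excluded by accident.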
>can you do it in parallel?
>I can but it is error prone and I prefer not to do it.
Ok, sorry for insisting on this, but what if you split the directory tree?
        Root
    -----|-----
    |         |
   ACE       TAO
One thread for ACE and another for TAO (what I mean is: split the top-level directory tree among the available cores).
Does this avoid the errors you mention?
Rolando
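
The proposal above can be sketched as one task per top-level directory. This is only an illustration of the partitioning idea, with hypothetical names (`SplitTreeScan`, `countFiles`, `countFilesParallel`), and it merely counts files. It says nothing about the shared IDE state that makes real parallel project opening error prone, and the split is only as balanced as the subtrees are.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: scan each top-level subdirectory in its own task.
final class SplitTreeScan {
    /** Counts regular files under dir recursively, in the calling thread. */
    static long countFiles(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            return 0; // not a directory, or unreadable
        }
        long count = 0;
        for (File child : children) {
            count += child.isDirectory() ? countFiles(child) : 1;
        }
        return count;
    }

    /** Splits the work at the first directory level (for example ACE and TAO). */
    static long countFilesParallel(File root, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long total = 0;
            File[] top = root.listFiles();
            if (top != null) {
                for (File entry : top) {
                    if (entry.isDirectory()) {
                        Callable<Long> task = () -> countFiles(entry);
                        parts.add(pool.submit(task));
                    } else {
                        total++; // files directly under the root
                    }
                }
            }
            for (Future<Long> part : parts) {
                total += part.get(); // wait for each subtree
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```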

Alex,
I have noticed that some .h and .cpp files (and .inl; I added the inl extension as a header type) are greyed out, but they are used, and if I open such a file, it has been parsed correctly and doesn't seem to have any issue. Ex: ace/Assert.h (I didn't change the ignore list). What do you think?
Rolando

Alex,
I have the following errors with ACE_TAO (I have updated the ignore file list). When project creation finishes, I get "completion failed":
0 out of 6,597 source files have limited code assistance
0 out of 6,858 header files have limited code assistance
Do you have any clue why this happens?
Having:
1. mkdir build (in ACE_TAO main directory)
2. cd build
3. ../configure --enable-ace-reactor-notification-queue
I manually add the include and macro files.
include:
ACE_TAO/
ACE_TAO/TAO
ACE_TAO/TAO/orbsvcs
macro:
ACE_TAO/build/ace/config.h
ACE_TAO/build/TAO/tao/config.h
I don't know if it's because of the code completion error, but when I try to find usages of:
ACE_TAO/TAO/tao/RTCORBA/Thread_Pool.h
TAO_Thread_Pool_Threads::run
it takes a very long time to find the usages...

>When project creation finishes, I get "completion failed":
- it means that code assistance still has failed include directives. You can see them in the project popup menu "Code Assistance->Failed include directives".
Causes:
- not enough information in object files
- bugs in the analysis algorithm
- bugs in the code model
If the "failed include" dialog shows only a few unresolved directives on the huge ACE, IMHO it is good enough. You can investigate why the code model has the failed include directives.
>it takes a very long time to find the usages...
Find usages consists of two steps:
- grep all source/header files to find the ID
- visit all references in the selected files
So if you are searching for a short ID, it is possible that many files will be involved in visiting references. Also, the first step can take a lot of time.

>>it takes a very long time to find the usages...
>Find usages consists of two steps:
>- grep all source/header files to find the ID
What does the initial parsing of project files do? I thought that a "complete" AST was created in this step.
>- visit all references in the selected files
Does the concept of an "indexer" exist, where you can query a declaration and find the possible bindings?
>So if you are searching for a short ID, it is possible that many files will be involved in visiting references.
>Also, the first step can take a lot of time.

Hi Alex (, Vladimir),
the process of opening ACE_TAO is very slow (I have excluded the .sln, etc.). Although the parsing is still a bit slow and uses a lot of memory, the opening seems to me the immediate bottleneck.
Rolando

>the process of opening ACE_TAO is very slow
Rolando, please file a separate issue for this.
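
On the find-usages question raised earlier in this thread, the two-step flow (a textual search for the identifier, then visiting references only in the files that matched) can be sketched roughly as follows. This is a toy model under assumptions: `TwoStepFindUsages`, `candidates`, and `confirmed` are hypothetical names, files are represented as in-memory strings, and the second step is a whole-token comparison standing in for real reference resolution by the code model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a two-step "find usages": cheap filter, then a
// more precise check on the surviving files only.
final class TwoStepFindUsages {
    /** Step 1: keep files whose text contains the identifier as a substring. */
    static List<String> candidates(Map<String, String> sources, String id) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : sources.entrySet()) {
            if (entry.getValue().contains(id)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    /** Step 2 (stand-in): keep candidates where the identifier is a whole token. */
    static List<String> confirmed(Map<String, String> sources, String id) {
        List<String> result = new ArrayList<>();
        for (String file : candidates(sources, id)) {
            // Split on anything that cannot be part of a C/C++ identifier.
            String[] tokens = sources.get(file).split("[^A-Za-z0-9_]+");
            if (Arrays.asList(tokens).contains(id)) {
                result.add(file);
            }
        }
        return result;
    }
}
```

A short identifier such as `run` also occurs inside longer names like `runtime_flags` or `prune`, so the first step returns many candidate files and the expensive second step has to visit all of them. A long identifier such as `TAO_Thread_Pool_Threads` narrows the candidate set much earlier.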

Hi Alex & Vladimir,
I read this issue - http://www.netbeans.org/issues/show_bug.cgi?id=134990 - and found this idea very interesting: "Creating some module that can collect performance data is great idea! Thanks. I will ask here if we have something like this." Can this be done for CND? It would also be helpful for the find usages issue...
Rolando

The issue covers so many different things (parsing, find usages, opening project) that it needs to be broken down into smaller parts. So, Alexander, Rolando, please go ahead and file separate issues. I would assume they are going to be P3s, since slowness is reasonably expected on huge projects. However, we are committed to working on performance issues and making CND capable of handling very large projects. For now I'm closing this IZ as FIXED, since a lot of improvements have been implemented by Alexander.

Hi,
just to avoid filing duplicates: find usages and opening project already have their own issue reports, so I will create a new one for the parsing/memory consumption. I really appreciate your effort; I am doing research work, and the capability to handle large projects is a must for me.
Thanks, Rolando

Hi,
Here are the separate issues:
Non scalable project opening - http://www.netbeans.org/issues/show_bug.cgi?id=161455
[code model] Parsing large projects is too slow & consumes too much memory - http://www.netbeans.org/issues/show_bug.cgi?id=162401
[code model] Non scalable "find usages" for small keywords - http://www.netbeans.org/issues/show_bug.cgi?id=161456
Rolando |