It depends on your OS. On Windows I suggest using Visual Studio, or, if you want to analyze the process while it is running, you can try Process Explorer to find the thread with high CPU usage and part of its stack. How well that works depends a lot on whether you have a single hot path causing the CPU usage or a combination of several problems. The second case is much harder to track down.
To be clear, these tools only give you hints. You need to know in depth how the code works to get to the right place in the code.
(Alternatively, compile a full debug build and run it with the debugger attached and a performance session running, but this will make your core at least 10 times slower and laggy.)
For other operating systems I can't give you a suggestion.
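If no profiler is at hand, one OS-independent way to confirm or rule out a suspected hot path is to time it by hand. The following is only a minimal sketch under my own assumptions: ScopedTimer, TimingStats, timingTable and the label "update" are hypothetical names made up for illustration, not part of any core. It measures wall-clock time, not CPU time, so treat the numbers as hints as well.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical helper types, not part of any existing core.
// Accumulated wall-clock time and call count for one label.
struct TimingStats {
    std::uint64_t calls = 0;
    std::chrono::nanoseconds total{0};
};

// Global table of measurements, keyed by label.
inline std::map<std::string, TimingStats>& timingTable() {
    static std::map<std::string, TimingStats> table;
    return table;
}

// Mutex guarding the table, so timers may be used from several threads.
inline std::mutex& timingMutex() {
    static std::mutex m;
    return m;
}

// Records the time between construction and destruction under a label.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        std::lock_guard<std::mutex> lock(timingMutex());
        TimingStats& stats = timingTable()[label_];
        ++stats.calls;
        stats.total +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```

To use it, put something like `ScopedTimer t("update");` at the top of the function you suspect, let the server run for a while, then dump timingTable() and compare the totals per label. The lock adds a small cost per call, so keep it out of very tight inner loops.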
EDIT: Are you running the old vmaps, or have you updated this core to newer vmaps (maybe including mmaps)?