ID:	6683	Fixed in:
Issue Date:	2013-07-09 08:01 AEST	Owner:	CVS Support
Last Modified:	2017-05-18 14:43 AEST	Reporter:	Glen Starrett
Current Est.	0.0 hours	% Complete:	0.0
Status:	NEW /	Severity:	enhancement

Affected:	2.8.02
Description:	enh: server: rlog: --just-unique-tags

Actions:	2013-07-09 08:01 AEST by Glen Starrett - Add a new option for rlog to output just the unique tags. It is to be used in place of sending back the full output of rlog -h and then parsing it on the client, for the benefit of very large repositories. The TortoiseCVS function "refresh list" (for a list of available tags) with the 'search subfolders' option checked runs "cvs ... rlog -h MODULE", parses that output for tag names, then sorts them to get unique entries. The additional processing takes roughly 30 to 45 times longer than simply running the command at a command line (e.g. 9 minutes in TCVS vs 12 seconds at the command line for a large-ish number of files, or 5 hours vs 10 minutes for a very large repository). Moving this processing to the server should improve performance greatly by reducing both the amount of data sent back to the client and the processing done on the client.
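For illustration only, the "parse for tag names, then reduce to unique entries" step described above might look like the Python sketch below. It is not TortoiseCVS's code and not the proposed server option. The function name `unique_tags`, the sample file paths and the sample tag names are all made up, and the sample assumes the usual rlog -h header layout: a "symbolic names:" line followed by indented "TAG: revision" lines.

```python
from typing import Iterable, List


def unique_tags(rlog_lines: Iterable[str]) -> List[str]:
    """Collect the distinct tag names from `cvs rlog -h` style output."""
    tags = set()          # set membership keeps de-duplication cheap
    in_symbols = False    # True while inside a "symbolic names:" block
    for line in rlog_lines:
        if line.startswith("symbolic names:"):
            in_symbols = True
        elif in_symbols and line[:1] in (" ", "\t"):
            # Indented "TAG: revision" entry; keep only the tag name.
            name, sep, _rev = line.strip().partition(":")
            if sep and name:
                tags.add(name)
        else:
            # Any non-indented line ends the symbolic names block.
            in_symbols = False
    return sorted(tags)


# Hypothetical two-file sample in the assumed header layout.
sample = """\
RCS file: /repo/mod/a.c,v
head: 1.3
branch:
locks: strict
access list:
symbolic names:
\tREL_1_1: 1.3
\tREL_1_0: 1.2
keyword substitution: kv
total revisions: 3
=============
RCS file: /repo/mod/b.c,v
head: 1.2
branch:
locks: strict
access list:
symbolic names:
\tREL_1_1: 1.2
\tBRANCH_X: 1.1.0.2
keyword substitution: kv
total revisions: 2
=============
"""

print(unique_tags(sample.splitlines()))
```

Reading the output line by line and keeping the names in a set means the memory used grows with the number of distinct tags rather than with the number of output lines, which is the property the requested option would move to the server side.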
	2013-07-09 08:05 AEST by Glen Starrett - Created an attachment (id=2645) Log files and test data
I spent some time (wow... 2 hours) this morning collecting data and traces on this, but I'm pretty much at the conclusion that he's got a TON of data and should use that "update list" button very, very sparingly (or switch to EVS).
The output from the rlog command that I'm using is just shy of a million lines (978,974). From the command line it takes ~16 seconds to save it off to a file, or 206 seconds to output it to the screen. I thought about that (after doing other stuff) and timed a plain "type rlog-out.txt" to the console, and that covers about 90% of the difference in those times. In other words, it just takes time to output that much data.
Now, Tortoise isn't outputting it. It's scanning it for tags, so it should be more efficient than output to console, but not as good as redirected to file. It isn't in that range; it's actually over the output-to-console timing. My guess is that its scanning method isn't as efficient as it could be and the list storage is POOR (I think it takes disproportionately longer as the list size grows, which is why the customer is seeing a 32:1 ratio of TCVS time to CLI-to-file time vs. 16:1 in my test). That's my guess anyway.
What do you think? Should I increase the size of my test set to correspond more closely to the customer's timing, to see whether the ratio holds or increases? (If so, I need a better way to time the TCVS "update list" window, because watching it is TEDIOUS.)
Notes follow, traces attached. The trace files correspond to the numbers next to the "trace" entries in the notes below.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
MH: Picking up again on the slow performance of TCVS 'fetch list' (rlog command).
Running on the VM (just 1 pass each; I'm not trying to establish statistical regularity here, but I did throw away the first pass to allow the repository to be somewhat loaded in cache).
Working with an output length of 978,974 lines.

Test                                       Trace          Time
CLI redirected to file                                    16.3 sec
CLI redirected to file (no focus on VM)                   15.6 sec
CLI to screen                                             206.5 sec
CLI traced (to file)                       2412 & 2624    15.8 sec
CVS Suite TortoiseCVS regular                             258 sec
CVS Suite TortoiseCVS traced               2440 & 1672    655 sec
OSS TortoiseCVS regular                                   220 sec

Start 7:42:00am end 7:46:18am >> 4min 18s >> 4*60+18=258
Start 7:59:00am end 8:09:55am >> 10min 55s >> 10*60+55=655

NOTE: TCVS is also using the "-x" option on the command, which encrypts all net traffic. I haven't set that option in the prefs (it's at default values), so I've no idea why it is on, but it's not included in my CLI tests, so I need to re-check those to see if it affects the timing.

CLI redirected to file + encrypted + traced               17.28s
CLI redirected to file + encrypted                        19.08s, 17.5s
CLI to screen + encrypted                                 214.9s
>> Doesn't seem significant when compared to above.

Uninstalling CVS Suite / Installing OSS CVS Suite
>> Saved VM state with bookmark, reverted to install OSS TCVS. Installing TortoiseCVS 1.12.5.
Start 8:36:00am end 8:39:40am >> 3min 40s >> 3*60+40=220 sec

The results seem to be more tied to "doing something on client end" than straight output. I'm wondering, if I run the output to a file and then do a type on the file, whether the total will look like the 'output to screen' results. E.g.:
cvs -d … rlog mingw > log.txt && type log.txt
Does that equal the "CLI to screen" thing?

type rlog-out.txt                                         161.7s
Total for CLI to file + type output                       16+161.7=177.7
Difference in totals                                      177.7-206.5=-28.8
>> Looks like MOST of the difference in time is due to straight console output delays in Windows.

TCVS to CLI ratio in TEST: 258/16=16.125
Customer tests are 5-10 min at CLI and 5.25 hours with TCVS.
5.25 hours to seconds: 5.25*60*60=18,900
TCVS to CLI ratio by Customer: 18900/600=31.5
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
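The arithmetic in the notes above can be re-checked with a short script. This is only a sketch in Python (nothing in this entry is written in Python). The helper name `elapsed_seconds` is hypothetical, and every input is a figure quoted in the notes. The 600-second customer CLI time is the 10-minute upper end of the "5-10 min" range, as used in the notes.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two same-day wall-clock times written as H:MM:SS."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Stopwatch readings from the notes, converted to seconds.
cvs_suite_regular = elapsed_seconds("7:42:00", "7:46:18")   # 4 min 18 s
cvs_suite_traced = elapsed_seconds("7:59:00", "8:09:55")    # 10 min 55 s
oss_regular = elapsed_seconds("8:36:00", "8:39:40")         # 3 min 40 s

# TCVS time divided by CLI-to-file time, test VM vs. customer.
test_ratio = cvs_suite_regular / 16
customer_tcvs = int(5.25 * 60 * 60)      # 5.25 hours in seconds
customer_ratio = customer_tcvs / 600     # 10 min CLI run, upper bound

# "CLI to file" plus "type" of the saved file, compared with "CLI to screen".
file_plus_type = 16 + 161.7
gap = round(file_plus_type - 206.5, 1)

print(cvs_suite_regular, cvs_suite_traced, oss_regular)
print(test_ratio, customer_ratio, gap)
```

If the customer's CLI run were taken at the 5-minute end of the range instead, the same division would give a ratio of 63, so the 31.5 figure is the conservative reading.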
	2013-07-10 01:52 AEST by Glen Starrett - Created an attachment (id=2646) TortoiseCVS log for test data, part 1
	2013-07-10 01:53 AEST by Glen Starrett - Created an attachment (id=2647) TortoiseCVS log for test data, part 2
	2013-07-10 01:54 AEST by Glen Starrett - Created an attachment (id=2648) TortoiseCVS log for test data, part 3
