tdpencil's blog

By tdpencil, history, 3 years ago, In English

Hey Codeforces!

I've been wondering this for a long time, but could never find a straight answer.

According to String substr on cplusplus.com, the complexity of substring is mostly linear but is unspecified. I thought this was interesting, but never thought much of it. After all, a string is just an array of characters?

I thought this at first until I found another post that claimed that C++ string was implemented with the Rope data structure.

So i tried the following code (on my Linux system).

#include <bits/stdc++.h>
using namespace std;


int main() {
	int t; cin >> t;
	string s=string(int(2e5), 'a');

	while(t--) {
		string v = s.substr(0);
	}
}

Surprisingly, this code took less than a second after entering t as 200000.

Now, I tried a true N^2 approach to test my code against the one above.

#include <bits/stdc++.h>
using namespace std;


int main() {
	int t; cin >> t;
	

	while(t--) {
		for(int i = 0; i <= 1e5; i++) {
			continue;
		}
	}
}

After entering t as 200000, the program takes 15 seconds (without compiler optimizations).

So, which one is correct — Rope or Char Array?

Thanks in advance.

  • Vote: I like it
  • +14
  • Vote: I do not like it

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

I guess underlying std::string is char*, because std::string::data returns char*

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    Sure, but what explains the enormous time difference? A rope joins substrings in log N, which is why I was trying to differentiate.

»
3 years ago, # |
  Vote: I like it +41 Vote: I do not like it

Benchmarks like this have a big problem with the compiler being smart enough to figure out you're not doing anything, so they end up deleting the code entirely. You can check out the assembly at https://godbolt.org/z/WajTsvcWr; in this case, no string construction is happening inside the loop.