URL: https://www.geeksforgeeks.org/dsa/what-is-h-index/

H-Index

Last Updated : 8 Aug, 2025

Given an array citations[] of size n such that citations[i] is the number of citations a researcher received for the i-th paper, find the H-index.

H-index(H) is the largest value such that the researcher has published at least H papers that have been cited at least H times.

The 'H' stands for Hirsch: the index was proposed by J.E. Hirsch in 2005. The H-index is an author-level metric that attempts to measure both the productivity and the citation impact of a scientist's or scholar's publications.

Examples:

Input: citations[] = [5, 0, 2, 0, 2]
Output: 2
Explanation: Three papers (with 5, 2, and 2 citations) have been cited at least 2 times, so H = 2 is achievable. H = 3 is not, because only one paper has 3 or more citations. Hence the H-index is 2.

Input: citations[] = [6, 0, 3, 5, 3]
Output: 3
Explanation: Four papers (with 6, 5, 3, and 3 citations) have been cited at least 3 times, so H = 3 is achievable. H = 4 is not, because only two papers (with 6 and 5 citations) have 4 or more citations. Hence the H-index is 3.
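
To make the definition concrete, here is a minimal Python sketch that checks it directly: it tries every candidate H from n down to 1 and returns the first one satisfied by at least H papers. The function name h_index_by_definition is a hypothetical helper introduced only for illustration. It runs in O(n × n) time and is not one of the approaches described in this article.

```python
def h_index_by_definition(citations):
    """Return the largest h such that at least h papers have >= h citations."""
    n = len(citations)
    # Try candidates from largest to smallest, so the first hit is the maximum.
    for h in range(n, 0, -1):
        papers_with_h_or_more = sum(1 for c in citations if c >= h)
        if papers_with_h_or_more >= h:
            return h
    # No positive h qualifies (empty input or all papers uncited).
    return 0


print(h_index_by_definition([5, 0, 2, 0, 2]))  # 2
print(h_index_by_definition([6, 0, 3, 5, 3]))  # 3
```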

[Naive Approach] Using Comparison-based Sorting - O(n × log n) Time and O(1) Space

The idea is to sort the citation counts in descending order so that the most cited papers appear first.
Then, scanning from the highest to the lowest citation count, we keep counting papers for as long as the current paper's citation count is at least the number of papers counted so far (including itself). The final count is the H-index.

Step by Step Approach:

  • Sort the citations[] array in descending order so that papers with the highest citations come first.
  • Initialize a counter idx = 0 to track how many papers meet the H-index condition.
  • Iterate through the sorted array from left to right:
    => If citations[idx] > idx, it means all papers from index 0 to idx have at least (idx + 1) citations.
    => In this case, increment idx and continue checking the next paper.
  • Stop when either:
    => We reach the end of the array, or
    => We find a paper where citations[idx] <= idx.
  • The final value of idx is the H-index.
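
The steps above can be sketched in Python as follows. This is a minimal illustration, not the article's original listing; the function name h_index_sorting is an assumption. Note that sorted() returns a new list, so this sketch uses O(n) extra space; sorting the input in place would avoid the copy.

```python
def h_index_sorting(citations):
    """H-index via descending sort followed by a linear scan."""
    # Most-cited papers first. sorted() copies the input, leaving it unchanged.
    ordered = sorted(citations, reverse=True)

    idx = 0
    # ordered[idx] > idx means the first idx + 1 papers
    # each have at least idx + 1 citations.
    while idx < len(ordered) and ordered[idx] > idx:
        idx += 1
    return idx


print(h_index_sorting([6, 0, 3, 5, 3]))
```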

Output
3

[Expected Approach] Using Counting Sort - O(n) Time and O(n) Space

The idea is similar to Counting Sort: maintain an array freq[] of size n + 1, where freq[i] stores the number of papers with exactly i citations. Any citation count greater than n is treated as n, because the H-index can be at most n when there are only n papers in total. As a result, freq[n] counts the papers with n or more citations.

After counting the frequencies, start from the last index n and, for each index i, keep a running count of the papers having at least i citations. As soon as this count becomes >= i at some index i, return i as the H-index. We start from the last index n because we want the largest value that satisfies the condition.

Step by Step Approach:

  • Get the number of papers → Store n = citations.size().
  • Create frequency array → freq of size n + 1 to store how many papers have a specific citation count (capped at n).
  • Count citation frequencies
    => If citations[i] >= n, increment freq[n] (treat as "n or more citations").
    => Else, increment freq[citations[i]].
  • Start from maximum possible H-index → Set idx = n.
  • Initialize count of qualifying papers → s = freq[n] (papers with ≥ n citations).
  • Reduce idx until condition met → While s < idx:
    => Decrement idx.
    => Add freq[idx] to s (include papers with exactly idx citations).
  • Return H-index → Once s >= idx, return idx as the H-index.
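
A minimal Python sketch of these steps is shown below. It reuses the names freq, idx, and s from the list above; the function name h_index_counting is an assumption made for illustration, and the input is assumed to contain only non-negative integers.

```python
def h_index_counting(citations):
    """H-index via a capped frequency array, O(n) time and O(n) space."""
    n = len(citations)

    # freq[i] = number of papers with exactly i citations;
    # freq[n] collects every paper with n or more citations.
    freq = [0] * (n + 1)
    for c in citations:
        freq[min(c, n)] += 1

    idx = n
    s = freq[n]  # papers with at least idx citations
    # The loop always stops by idx = 0, because s then equals n >= 0.
    while s < idx:
        idx -= 1
        s += freq[idx]
    return idx


print(h_index_counting([6, 0, 3, 5, 3]))
```

For citations[] = [6, 0, 3, 5, 3], n = 5 and freq = [1, 0, 0, 2, 0, 2]. The scan starts at idx = 5 with s = 2, moves to idx = 4 with s = 2, and then to idx = 3 with s = 4. Since 4 >= 3, the loop stops and returns 3.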

Output
3