Many researchers have unpublished data, and some of it may never be published as a manuscript. I would like to make a scholarly contribution from data that I have no intention of writing up as a conventional article, e.g. by publishing a "data paper".
The term "data paper" may be too new to be familiar, so here is a description from the Ecological Archives website:
Data Papers are compilations and syntheses of data sets and associated metadata deemed to be of significant interest to the ESA membership and the scholarly community. Data papers are peer reviewed and are announced in abstract form in the appropriate print journal as a Data Paper. Data papers differ from review or synthesis papers published in other ESA journals in that data papers normally will not test or refine ecological theory. Data Papers can facilitate the rapid advancement of ecological knowledge and theory at the same time that they disseminate information. In addition, Ecological Archives provides a reward mechanism (in the form of peer-reviewed, citable objects) for the substantial effort required to compile and adequately document large data sets of ecological interest
This brings up the following questions:
What makes a good data repository?
Which data repositories provide a DOI for raw data?
Should published data be separate from articles on a CV?
There are a few things that I would consider when choosing a data repository:
- Does it let you release your data under a license you're happy with?
- Applying too restrictive a license can prevent anyone from doing anything useful with the data, so think about what you're prepared to allow. In particular, remember that most of the research done in academia could be considered "commercial" from a legal perspective. On the other hand, you may wish to choose a license that ensures you get credit for your work. You may or may not agree with them, but reading the Panton Principles will give you some idea of the issues here. It is also worth looking at licenses written specifically with data in mind.
- How easy will your data be to find?
- People will only use your data if they can find it. I recommend searching Google (other search engines are available) for some datasets you know of in your field and seeing whether they come up. Repositories that are indexed by the major search engines will put you at a big advantage when it comes to attracting citations.
- What repositories are well known in your field?
- Your institution may have a repository that you can easily deposit in, but it probably won't be the first place colleagues in your field think to look. If there are well-established repositories in your field, I would prefer those, or at least make sure your data is indexed by a well-established aggregator (I know ANDS runs a national aggregator in Australia).
- What does your institution allow?
- In many cases, your institution will own (or otherwise have a claim to) the data you generate as part of your research, so check what your local policies are and, if need be, ask your supervisor, head of department, legal team, etc. This will particularly affect your choice of license.
The other parts of your question can probably be answered better by others here (or maybe the question should be split into several?).